SimpleWebKit

From GNUstepWiki

Background Information

  • originated in mySTEP
  • is written entirely in Objective-C (1.0), so it can be compiled on any system, even with gcc 2.95.3
  • aims at providing the most popular documented methods of Full WebKit for the classes WebView, WebFrame, WebDataSource, etc.
  • aims at rendering (X)HTML as well as possible (but not perfectly)
  • uses NSAttributedStrings passed into NSTextView as the rendering backend
  • already displays many pages
  • is used in the Vespucci.app Web Browser application for GNUstep

Source Code

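Anonymous checkout:

svn co http://svn.gna.org/svn/gnustep/libs/simplewebkit/trunk

Access via SSH (replace "user" with your account name):

svn+ssh://user@svn.gna.org/svn/gnustep/libs/simplewebkit/trunk

Browse the repository online:

http://svn.gna.org/viewcvs/gnustep/libs/simplewebkit/trunk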

How to compile

 mkdir SWK
 cd SWK
 svn co http://svn.gna.org/svn/gnustep/libs/simplewebkit/trunk
 cd trunk
 make

Status

Features of the Subversion (SVN) trunk code:

  • parses (X)HTML into a DOM tree
  • renders approx. 98% of the HTML 4.0 tags in a reasonable way (e.g. <font color="#667788">, <center>, and <h2> work)
  • makes <a> links clickable and processes them
  • loads <img>
  • loads <script> etc. asynchronously i.e. loads them as subresources
  • loads CSS etc. asynchronously i.e. loads them as subresources
  • handles <frame>
  • handles <form>
  • handles <table> (as far as the NSTextView backend can handle it)
  • has an ECMAScript engine that parses 90% of the syntax and evaluates expressions (still missing: statements and the native objects such as "document", "window", "event", etc.)

Missing:

  • properly compile and run <script>
  • completion of the ECMAScript engine
  • properly parse and apply CSS

SWK Browser

SWK Browser is part of the SimpleWebKit project and serves more or less as a test bed for it, although it has most features of a full browser. These include:

  • multiple documents
  • shows (X)HTML, Images etc.
  • can show page source code
  • can show activities (i.e. subresources)
  • can show a DOM Tree inspector
  • has a JavaScript console

We have compiled SimpleWebKit and SWK Browser with Cocoa so that they run natively on a Mac.

SWK Browser has been ported to GNUstep as well and runs using NIB loading without any additional code change.

Download it here: [1]

Screenshots

Some first screenshots (taken with SWK Browser on a Mac):

File:SimpleWebKit Example 1.png

How it Works

1. the WebView

  • is the master view object and there is only one per browser (or browser tab)
  • it holds the mainFrame which represents either the normal <body> or the top level <frame> or <frameset>
  • if there is a <frameset> hierarchy, there are additional child WebFrames
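
The view/frame relationship described above can be pictured as a small tree. The following is only an illustrative Python sketch, not SimpleWebKit's Objective-C code; the class names View and Frame and the find method are hypothetical stand-ins for WebView and WebFrame.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    """Hypothetical stand-in for a WebFrame: a name plus child frames."""
    name: str
    children: List["Frame"] = field(default_factory=list)

    def find(self, name: str) -> Optional["Frame"]:
        """Depth-first search for a frame by name, starting at this frame."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None


@dataclass
class View:
    """Hypothetical stand-in for a WebView: it owns exactly one main frame."""
    main_frame: Frame


# A <frameset> page: the main frame has children, which may nest further.
view = View(Frame("main", [Frame("menu"), Frame("content", [Frame("ad")])]))
found = view.main_frame.find("ad")
```

A page without a <frameset> would simply be a View whose main frame has no children.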

2. the WebFrame

  • is responsible for loading and rendering content from a specific URL
  • it uses a WebDataSource to trigger loading and get callbacks
  • it is also the owner of the DOMDocument tree
  • JavaScript statements are evaluated in a frame context
  • it is also the target of user clicks on links since it knows the base URL (through the WebDataSource)
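
Why the frame needs the base URL for link clicks can be shown with a short example. This is a Python sketch using the standard library's urljoin, not the code SimpleWebKit uses; the function name resolve_link and the example.com URLs are assumptions made for illustration.

```python
from urllib.parse import urljoin


def resolve_link(base_url: str, href: str) -> str:
    """Resolve the href of a clicked <a> element against the frame's base URL."""
    return urljoin(base_url, href)


# Hypothetical base URL of the document currently shown in the frame.
base = "http://www.example.com/docs/index.html"

relative = resolve_link(base, "chapter1.html")
rooted = resolve_link(base, "/images/logo.png")
parent = resolve_link(base, "../about.html")
absolute = resolve_link(base, "http://other.example.org/")
```

Only the last href is usable on its own; the other three have no meaning without the base URL of the frame that contains the link.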

3. the WebDataSource

  • is responsible for loading data from a URL
  • it may cache data and handle/synchronize loading of subresources (e.g. for an embedded <img> tag)
  • it translates the request and the response URLs
  • it provides an estimated content length (for a progress indicator) and the MIMEType of the incoming data stream
  • as soon as the header comes in, a WebDocumentRepresentation is created and is notified of incoming segments
  • it also collects the incoming data, so that a WebDocumentRepresentation can handle either segments or the collected data
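
The "segments or collected data" idea can be sketched as follows. This is a minimal Python illustration under assumed names (DataCollector, receive, progress), not the WebDataSource implementation: each incoming segment is appended to a buffer, and the consumer is handed both the new segment and everything collected so far.

```python
class DataCollector:
    """Hypothetical stand-in for the collecting side of a data source."""

    def __init__(self, expected_length, on_segment):
        self.expected_length = expected_length  # estimate; None if unknown
        self.on_segment = on_segment            # called once per segment
        self.collected = bytearray()            # everything received so far

    def receive(self, segment):
        """Store a segment, then notify the consumer with both views."""
        self.collected.extend(segment)
        self.on_segment(segment, bytes(self.collected))

    def progress(self):
        """Fraction loaded (0.0 to 1.0), or None without a length estimate."""
        if not self.expected_length:
            return None
        return min(1.0, len(self.collected) / self.expected_length)


seen = []
collector = DataCollector(13, lambda seg, total: seen.append((seg, total)))
collector.receive(b"<html>")
halfway = collector.progress()
collector.receive(b"</html>")
```

An incremental consumer (such as an HTML parser) would look only at the first argument, while a consumer that needs the complete data (such as an image decoder) would wait and use the second.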

4. the WebDocumentRepresentation(s)

  • there is one for each MIME type (the WebView provides a mapping database)
  • it is responsible for parsing the incoming data stream (either completely when finished, or partially)
  • and provides a more suitable representation, e.g. an NSImage or a DOMHTMLTree
  • finally, it creates a WebDocumentView as the child of the WebView and attaches it to the WebFrame as the -webFrameView
  • so, if you want to handle an additional MIME type, write a class that conforms to the WebDocumentRepresentation protocol
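
The MIME type mapping can be thought of as a lookup table from type to class. The sketch below is Python and purely illustrative; the registry dictionary, the class names and representation_for are hypothetical and do not mirror the actual registration API.

```python
class HTMLRepresentation:
    """Hypothetical representation that would build a DOM tree."""
    produces = "DOM tree"


class ImageRepresentation:
    """Hypothetical representation that would decode an image."""
    produces = "image"


# Assumed mapping database: MIME type -> representation class.
registry = {
    "text/html": HTMLRepresentation,
    "image/png": ImageRepresentation,
}


def representation_for(mime_type):
    """Create the representation for a MIME type, or None if unsupported."""
    key = mime_type.split(";")[0].strip().lower()  # drop "; charset=..." etc.
    cls = registry.get(key)
    return cls() if cls is not None else None


class PlainTextRepresentation:
    """Supporting a new type means writing one more class ..."""
    produces = "plain text"


registry["text/plain"] = PlainTextRepresentation  # ... and registering it.

html_rep = representation_for("Text/HTML; charset=utf-8")
text_rep = representation_for("text/plain")
unknown = representation_for("application/x-unknown")
```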

5. NSXMLParser

  • a private variant is used that adds a stalling mechanism and selection of the file encoding
  • recognizes the <?xml> tag for XHTML and has a lazy mode for pure HTML
  • it also has an Entity table that translates the HTML entities to character strings
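
An entity table of this kind boils down to a dictionary lookup plus handling of numeric references. The following Python sketch shows the principle only; the table is a tiny assumed excerpt, and decode_entities is a hypothetical name, not a function of the parser.

```python
import re

# Tiny excerpt of an entity table; a real one covers all HTML entities.
ENTITIES = {
    "amp": "&",
    "lt": "<",
    "gt": ">",
    "quot": '"',
    "nbsp": "\u00a0",
    "copy": "\u00a9",
}

_ENTITY_RE = re.compile(r"&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def decode_entities(text):
    """Replace named and numeric character references by their characters."""
    def replace(match):
        name = match.group(1)
        if name.startswith("#x"):
            return chr(int(name[2:], 16))   # hexadecimal reference
        if name.startswith("#"):
            return chr(int(name[1:]))       # decimal reference
        return ENTITIES.get(name, match.group(0))  # unknown: keep as written

    return _ENTITY_RE.sub(replace, text)


decoded = decode_entities("a &lt; b &amp;&amp; c &#65;&#x42; &bogus;")
```

Unknown names are left untouched here, which is one lenient choice among several a lazy HTML mode could make.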

6. the DOMHTMLTree

  • is used only for HTML content
  • is built in WebHTMLDocumentRepresentation by parsing incoming segments of HTML data
  • any change in the DOMHTMLTree is notified to the WebDocumentView (or one of its subviews) by setNeedsLayout
  • each class of potential DOMHTMLTree records has methods to denote how to handle tags
  • the tag to class mapping is table driven
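
A table-driven tag to class mapping might look like the following Python sketch. All names (Element, Paragraph, Anchor, Image, TAG_TABLE, element_for_tag) are hypothetical and chosen for illustration; they are not the DOMHTML class names.

```python
class Element:
    """Generic element, used for any tag without a special class."""
    is_block = False        # does the element start a new paragraph?
    has_end_tag = True      # does the parser expect a closing tag?

    def __init__(self, tag):
        self.tag = tag
        self.children = []


class Paragraph(Element):
    is_block = True


class Anchor(Element):
    pass


class Image(Element):
    has_end_tag = False     # <img> never has content or a closing tag


# The table: adding support for a tag means adding one entry.
TAG_TABLE = {
    "p": Paragraph,
    "a": Anchor,
    "img": Image,
}


def element_for_tag(tag):
    """Look up the class for a tag (case-insensitively) and instantiate it."""
    name = tag.lower()
    return TAG_TABLE.get(name, Element)(name)


img = element_for_tag("IMG")
other = element_for_tag("blink")
```

Per-class attributes such as has_end_tag are how each class can tell the parser how its tag is to be handled, without the parser containing a special case per tag.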

7. the WebDocumentView(s) and its subviews

  • are responsible for displaying the contents of their WebDocumentRepresentation
  • either HTML, Images, PDF or whatever (e.g. SVG, XML, ...)
  • they get notified about changes either by updates of the WebDataSource (-dataSourceUpdated:) or directly (-setNeedsLayout:)
  • if one needs layout, it must go to the DOM Tree to find out what has changed and update its size, content, children, layout etc.
  • this is a little tricky/risky since the -layout method is called within -drawRect:, so changing e.g. the view frame is very critical and may result in drawing glitches
  • for HTML, we do a simple trick: the WebDocumentView is an NSTextView and the DOMHTMLTree objects can be traversed to return an attributedString with embedded Tables and NSTextAttachments
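
The traversal trick for HTML can be sketched in a few lines. This Python example is an illustration under assumptions, not the actual implementation: a list of (text, attributes) pairs stands in for an NSAttributedString, and Node, STYLE and runs are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str = ""                    # empty string marks a text node
    text: str = ""
    children: list = field(default_factory=list)


# Assumed attributes contributed by each tag.
STYLE = {
    "b": {"bold": True},
    "i": {"italic": True},
    "h2": {"bold": True, "size": 18},
}


def runs(node, inherited=None):
    """Walk the tree and return a flat list of (text, attributes) runs."""
    attrs = dict(inherited or {})
    attrs.update(STYLE.get(node.tag, {}))   # a tag adds to what it inherits
    if not node.tag:                         # text node: emit one run
        return [(node.text, attrs)] if node.text else []
    result = []
    for child in node.children:
        result.extend(runs(child, attrs))
    return result


tree = Node("p", children=[
    Node(text="Hello "),
    Node("b", children=[Node(text="wor"), Node("i", children=[Node(text="ld")])]),
])
styled = runs(tree)
```

Each element only needs to know which attributes it adds; nesting is handled by passing the accumulated attributes down the tree.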

8. the JavaScript engine

  • is programmed according to the ECMA-262 specification [2]
  • uses a simple recursive stateless parser (could be optimized in stack usage and speed by a state-table driven approach)
  • parses the script into a Tree representation in a first step
  • then, evaluates the expressions and statements according to the current environment
  • this allows scripts to be stored in translated form and re-evaluated when needed (e.g. on mouse events)
  • uses Foundation for basic types (string, number, boolean, null)
  • uses WebScriptObject as the base Object representation
  • DOMObjects are a subclass of WebScriptObjects and therefore provide bridging, so that changing a DOMHTML tree element through JavaScript automatically triggers the appropriate WebDocumentView update notification
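
The two-step approach (parse once into a tree, evaluate as often as needed) can be demonstrated with a very small recursive-descent parser. This Python sketch handles only arithmetic expressions with variables and is in no way the engine's real grammar; tokenize, parse and evaluate are hypothetical names.

```python
import re

TOKEN_RE = re.compile(r"\s*(\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*/()])")


def tokenize(src):
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        match = TOKEN_RE.match(src, pos)
        if match is None:
            raise SyntaxError("unexpected character at offset %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(src):
    """Step 1: turn the source text into a tree of nested tuples."""
    tokens, pos = tokenize(src), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def primary():
        token = take()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = additive()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token[0].isdigit():
            return ("num", float(token))
        if token[0].isalpha() or token[0] == "_":
            return ("var", token)
        raise SyntaxError("unexpected token %r" % token)

    def multiplicative():
        node = primary()
        while peek() in ("*", "/"):
            op = take()
            node = (op, node, primary())
        return node

    def additive():
        node = multiplicative()
        while peek() in ("+", "-"):
            op = take()
            node = (op, node, multiplicative())
        return node

    tree = additive()
    if peek() is not None:
        raise SyntaxError("unexpected token %r" % peek())
    return tree


def evaluate(tree, env):
    """Step 2: evaluate a stored tree against the current environment."""
    kind = tree[0]
    if kind == "num":
        return tree[1]
    if kind == "var":
        return env[tree[1]]
    left, right = evaluate(tree[1], env), evaluate(tree[2], env)
    if kind == "+":
        return left + right
    if kind == "-":
        return left - right
    if kind == "*":
        return left * right
    return left / right


handler = parse("x * (2 + 3)")          # parsed once, e.g. at page load
first = evaluate(handler, {"x": 4})     # evaluated on one event ...
second = evaluate(handler, {"x": 10})   # ... and again on another
```

The stored tree plays the role of the "translated form": the cost of parsing is paid once, while each event only triggers an evaluation with the environment current at that time.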

9. the CSS engine

  • CSS style sheets are translated into a DOMCSS tree
  • sheets referenced via @import are loaded as needed
  • the CSS is not yet applied to the HTML to NSAttributedString translation
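
What "translating a style sheet into a tree" means can be shown with a deliberately naive Python sketch. It handles only plain "selector { property: value }" rules, ignores at-rules such as @import, and is not the DOMCSS implementation; parse_css and the tuple layout are assumptions for illustration.

```python
import re

RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_css(source):
    """Return a list of (selectors, declarations) pairs, one per rule."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.S)  # strip comments
    rules = []
    for selectors, body in RULE_RE.findall(source):
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip().lower()] = value.strip()
        selector_list = [s.strip() for s in selectors.split(",")]
        rules.append((selector_list, declarations))
    return rules


sheet = parse_css(
    "h1, h2 { color: #667788; font-weight: bold } /* note */ p { margin: 0; }"
)
```

Applying such rules would then mean matching each selector against the DOM elements during the translation to an attributed string, which is the step still missing.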

Contact

Author: Nikolaus Schaller

QuantumSTEP: http://www.quantum-step.com