Acorn Browse Plug-In Handling Description
=========================================

* Document status
  ---------------

  Distribution:    General Release
  Title:           Acorn Browse Plug-In Handling Description
  Drawing number:  1216,210/T
  Issue:           1
  Author(s):       Andrew Hodgkinson
  Date:            19/06/98
  Revision:        1.00
  Change number:   N/A
  Last issue:      N/A

* Contents
  --------

  i.    Document Status
  ii.   Issue / revision History
  iii.  Overview
  iv.   The Preprocessor
  v.    The generic OBJECT handler
  vi.   Plug-in queue code
  vii.  Streams
  viii. Other issues
  ix.   References
        1. Acorn Plug-In Protocol Functional Specification
        2. Wimp message protocol
        3. Director Player Software Functional Specification
        4. Java Software Functional Specification

  An HTML version of this document may be found at:

    http://www.acorn.com/browser/plug-in_handling/descdoc.html

* Issue / revision history
  ------------------------

  Issue 1 (General release)

    1.00  19/06/98  Created in response to several queries regarding the
                    behaviour of Browse when handling plug-ins.

* Overview
  --------

  When Browse fetches data, it passes it through the HTML parsing library.
  The HTML parser returns structures Browse can better deal with before they
  actually get formatted into a displayable form. Image fetches, frames
  layout and so on all happen at this early stage; so does dealing with
  EMBED, OBJECT or APPLET elements. This is a description of what Browse
  does when it encounters those elements.

* The Preprocessor
  ----------------

  First, the preprocessor - the thing that comes after the HTML parsing but
  before the page layout - examines the parser data and discovers an EMBED
  or OBJECT tag. It passes it to the generic OBJECT handler. This is
  possible because of the way the parser organises the data for us:

    * For OBJECTs, it thinks of the DATA attribute as 'data'. It thinks of
      TYPE as 'type', CODETYPE as 'codetype' and CODEBASE as 'codebase'. If
      there's a CLASSID specifier it notes this too, but not immediately
      (see below).
      All PARAM elements are converted to individual internal structures.

    * For EMBEDs, there is no TYPE, CODETYPE, CODEBASE or CLASSID. It
      thinks of the SRC attribute as 'data', and all other attributes are
      each converted into name/value pairs that look like PARAM elements.

    * For APPLETs, there is no TYPE or CODETYPE. It thinks of the CODE
      attribute as 'CLASSID' and the CODEBASE as 'codebase'. All PARAM
      elements are converted to individual internal structures.

  So, the browser makes EMBED and APPLET look like OBJECT plus PARAM
  elements using the mapping described above. In the description below, the
  names 'data', 'type' etc. will be used; the mappings just given tell you
  what actual HTML attributes these values refer to.

  Note:

    * IMG elements are *never* considered to lead to plug-ins. So, if you
      were to try to put a Shockwave movie on a page with an IMG tag, say,
      the movie would be passed to the image processing library, which
      would not recognise the data - a placeholder would be displayed. To
      embed data that Browse doesn't understand, you must use EMBED or
      preferably OBJECT (after all, the W3C don't currently recognise EMBED
      as an HTML element).

* The generic OBJECT handler
  --------------------------

  Browse first examines 'data' and 'type'. If 'data' is set, it looks at
  'type', and tries to get a filetype from this using the MimeMap file.
  It'll default to FileType_DATA if it can't find anything. If 'type' is not
  set, it'll look for a filename extension on the 'data' attribute and again
  try to get a filetype from the MimeMap file, defaulting to FileType_DATA
  if nothing is found (hence this route would always be taken for EMBED).

  Then, it checks for:

    * FileType_PNG
    * FileType_GIF
    * FileType_JPEG
    * FileType_TIFF
    * FileType_XBM
    * FileType_BMP

  If the filetype discovered from 'type' or maybe 'data' matches any of the
  above, the item is treated as an inline image. The function dealing with
  OBJECT or OBJECT-like elements exits here and the browser image handling
  code takes over.
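
  The first stage of the handler described above can be sketched roughly as
  follows. This is an illustration only, not Browse's source: the small
  table stands in for the MimeMap file, the function names are hypothetical,
  and the filetype numbers are the values believed to be the usual RISC OS
  allocations (check them against your own MimeMap file). String matching
  here is case sensitive for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Filetype numbers believed to be the usual allocations; illustrative only. */
enum {
    FileType_DATA = 0xFFD,
    FileType_PNG  = 0xB60,
    FileType_GIF  = 0x695,
    FileType_JPEG = 0xC85,
    FileType_TIFF = 0xFF0,
    FileType_XBM  = 0xB61,
    FileType_BMP  = 0x69C
};

/* Hypothetical stand-in for the MimeMap file. */
typedef struct {
    const char *mime;
    const char *ext;
    int         filetype;
} mime_entry;

static const mime_entry mime_table[] = {
    { "image/png",  "png", FileType_PNG  },
    { "image/gif",  "gif", FileType_GIF  },
    { "image/jpeg", "jpg", FileType_JPEG },
    { "image/tiff", "tif", FileType_TIFF },
    { "image/bmp",  "bmp", FileType_BMP  }
};

/* Look up by MIME type or by extension (pass NULL for the one not used).
 * Returns -1 if nothing matches. */
static int mime_lookup(const char *mime, const char *ext)
{
    size_t i;

    for (i = 0; i < sizeof mime_table / sizeof mime_table[0]; i++) {
        if (mime != NULL && strcmp(mime, mime_table[i].mime) == 0)
            return mime_table[i].filetype;
        if (ext != NULL && strcmp(ext, mime_table[i].ext) == 0)
            return mime_table[i].filetype;
    }
    return -1;
}

/* Return the text after the last '.' in the leafname, or NULL if none. */
static const char *extension_of(const char *url)
{
    const char *dot   = strrchr(url, '.');
    const char *slash = strrchr(url, '/');

    if (dot == NULL || (slash != NULL && dot < slash)) return NULL;
    return dot + 1;
}

/* 'type' wins if set; otherwise the extension on 'data' is used. Anything
 * unrecognised becomes FileType_DATA. Returns -1 if 'data' is unset, in
 * which case the caller moves on to the CLASSID handling. */
int object_filetype(const char *data, const char *type)
{
    int filetype = -1;

    if (data == NULL) return -1;

    if (type != NULL) {
        filetype = mime_lookup(type, NULL);
    } else {
        const char *ext = extension_of(data);
        if (ext != NULL) filetype = mime_lookup(NULL, ext);
    }
    return filetype < 0 ? FileType_DATA : filetype;
}

/* Non-zero if the filetype is one of the six inline image types. */
int is_inline_image(int filetype)
{
    switch (filetype) {
    case FileType_PNG:  case FileType_GIF: case FileType_JPEG:
    case FileType_TIFF: case FileType_XBM: case FileType_BMP:
        return 1;
    default:
        return 0;
    }
}
```

  For example, object_filetype("pic.png", NULL) yields FileType_PNG and the
  item would be handed to the image code, whereas an unrecognised extension
  yields FileType_DATA and the handler carries on.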

  Note:

    * If 'data' is not set, Browse now sets 'data' to the value of the
      CLASSID attribute (which may also not be set of course, for example
      in EMBEDs - even if there were a CLASSID attribute specified for some
      reason, it'll have been made to look like a PARAM tag by the HTML
      parser as only the SRC attribute is considered "part of" the actual
      EMBED).

  By this point, then, the item has not been handled as an inline image,
  either because 'data' wasn't set (in which case the CLASSID attribute's
  value has been read into 'data') or because the filetype deduced as
  described above did not match an image type that Browse knows about.

  We now check again - is 'data' set? (Remember, Browse could be checking
  for DATA, CLASSID, or SRC attributes on OBJECT or EMBED elements by now).
  If it is, a special case of 'clsid:' - an Active-X component - is checked
  for; if this is present, the item is marked not handleable.

  Otherwise, Browse looks at 'codetype'. If this is unset, it tries to find
  a filename extension on 'data'. If Browse can't find either, the item is
  marked not handleable. Otherwise, it tries to get a filetype from the
  MimeMap file using 'codetype' or the extension on 'data' (in that order).
  Should it find a filetype in the file, it'll construct a system variable
  name of Alias$@PlugInType_XXX where 'XXX' is the hex equivalent of the
  found filetype. If however it cannot get a filetype from the MimeMap file,
  the item is again marked not handleable.

  Should Browse have constructed the system variable name as described, it
  checks to see if the variable exists. If it does not, Browse assumes that
  no plug-in is available for the data type the OBJECT or EMBED element is
  specifying and marks the item as not handleable.

  By now, we'll either have identified what plug-in to use or marked the
  item not handleable - in this latter case, Browse will see if there is a
  stream of alternative HTML that it can display. OBJECT elements allow
  this, for example. If there is, this HTML is shown.
  Otherwise, a simple placeholder will be displayed. Note that there are
  various parser limitations in the depth of alternative HTML that is
  allowed; for example, partial form construction through such HTML is
  unlikely to succeed. No pages have yet been found where this causes a
  problem, and for the time being, there are no plans to address this minor
  deficiency.

  Note:

    * From here on, we will assume that Browse did *not* handle the EMBED
      or OBJECT element as an in-line image, and that it *did* find a
      suitable plug-in to deal with the data.

  Depending on the configuration options, Browse could wait until the
  redraw engine has to show the placeholder for the plug-in before
  attempting to launch it (Start When Viewed); or it could try to launch the
  plug-in immediately (Start As Soon As Possible). In this latter case, the
  data structures that describe where the plug-in should be on the displayed
  page and how large it is have not yet been created, so Browse opens the
  plug-in at some arbitrary size and position off the visible area of the
  page.

  The generic OBJECT handling code will now either wait (until the redraw
  engine gets back to it), or call the plug-in queue functions to add the
  plug-in to the queue of waiting plug-ins.

* Plug-in queue code
  ------------------

  Several plug-ins may exist on a page, but you can't start attempting to
  fire up a hundred and one tasks at once - a 'one at a time' approach makes
  life very much simpler. Whenever Browse decides to fire off a plug-in, it
  simply tells the plug-in code to add the item to its queue. The plug-in
  code does so, and if there is only one item in the queue (the one it just
  added), it starts the plug-in launch process.

  1) The parameters file is written. This is done via a service function
     that creates a uniquely named file in Scrap. Browse fills in the
     mandatory BASEHREF, USERAGENT, UAVERSION and APIVERSION entries.
     Note that in version 2.05 and earlier of Browse, the BASEHREF value
     will always be equal to the displayed URL in the URL bar - even if a
     BASE element in the document specifies otherwise. In version 2.06 or
     later, the BASE element is looked at, then the display URL is used,
     and if nothing is being displayed yet the current fetching URL is used
     instead.

     Browse also fills in the optional BGCOLOR entry with the background
     colour of the *entire page* (if the plug-in lies inside a coloured
     table cell, Browse 2.06 and earlier do not notice and still give the
     page background).

     If the CLASSID attribute is present, Browse writes a CLASSID entry
     with a leafname taken from the CLASSID attribute and the value of the
     CODETYPE attribute. If CODEBASE is specified, Browse will also write a
     CODEBASE entry.

     If the CLASSID attribute is *not* present, Browse checks for the DATA
     attribute and, if it is there, writes a DATA entry specifying the DATA
     and TYPE attribute values.

     Finally, it writes the more generic entries 'STANDBY' (if STANDBY is
     specified) and 'HEIGHT', then 'WIDTH', using whatever sizes the
     browser has thus far decided the plug-in should initially take.

     Having done, effectively, the header entries, Browse goes on to output
     any required PARAM entries giving a value and type for each, depending
     on what is available in the HTML source.

  2) Browse constructs itself a browser instance handle for the plug-in.

  3) It constructs a message block for the Message_PlugIn_Open; the
     filetype it uses comes from:

       * CODETYPE; if unset,
       * TYPE; if neither is set,
       * filename extension on CLASSID; if all three unset,
       * filename extension on DATA.

  4) RMA space is claimed and the filename of the parameters file is
     written into it; the address is filled in to the message block.

  5) Message_PlugIn_Open is broadcast as a UserMessageRecorded (type 18).
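
  The order of preference in step 3 can be sketched as below. This is a
  minimal illustration rather than Browse's own code: the object_attrs
  structure and the function name are hypothetical, and "unset" is assumed
  here to mean that the attribute is absent or empty. The function only
  reports which attribute the filetype would be derived from; turning that
  into a filetype via the MimeMap file is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical holder for the attribute values of one OBJECT-like item. */
typedef struct {
    const char *codetype;
    const char *type;
    const char *classid;
    const char *data;
} object_attrs;

/* Which attribute the Message_PlugIn_Open filetype is derived from. */
typedef enum {
    FT_FROM_NONE,
    FT_FROM_CODETYPE,
    FT_FROM_TYPE,
    FT_FROM_CLASSID_EXT,
    FT_FROM_DATA_EXT
} ft_source;

/* Assumption: an attribute counts as set if it is non-NULL and non-empty. */
static int is_set(const char *s)
{
    return s != NULL && s[0] != '\0';
}

/* Walk the attributes in the documented order and return the first one
 * that is set. *value receives the chosen attribute string (a MIME type
 * for CODETYPE and TYPE, a name carrying an extension for the others),
 * or NULL if none of the four is set. */
ft_source open_filetype_source(const object_attrs *o, const char **value)
{
    assert(o != NULL && value != NULL);

    if (is_set(o->codetype)) { *value = o->codetype; return FT_FROM_CODETYPE;    }
    if (is_set(o->type))     { *value = o->type;     return FT_FROM_TYPE;        }
    if (is_set(o->classid))  { *value = o->classid;  return FT_FROM_CLASSID_EXT; }
    if (is_set(o->data))     { *value = o->data;     return FT_FROM_DATA_EXT;    }

    *value = NULL;
    return FT_FROM_NONE;
}
```

  So an item with only TYPE and DATA set would take its filetype from TYPE,
  and one with only DATA set would fall through to the extension on DATA.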

  Later, one of two things will happen:

  1) A reply
  2) A bounce

  In the meantime, this plug-in stays at the head of the plug-in queue (it
  isn't removed yet) and any other plug-ins that the browser decides to
  launch will queue up behind it.

  First, the case of the message bouncing. If the reference is recognised,
  the browser sees if it has already attempted to open this plug-in once
  before. If so, the parameters file is deleted, the RMA space claimed for
  the filename is freed, and the plug-in is removed from the queue, allowing
  the next item, if there is one, to be launched.

  If not, though, a system variable name of Alias$@PlugInType_XXX is created
  with XXX holding the hex equivalent of the filetype stored in the bounced
  message. If the variable exists, it is executed via Wimp_StartTask. If
  this is successful, Browse gets ready to do another message broadcast.

  By the time the message bounce has arrived and Wimp_StartTask has been
  called and returned, the page may have reformatted itself etc. so the
  plug-in may have moved. Browse checks the position and resets the relevant
  details in the message body. It then broadcasts Message_PlugIn_Open a
  second time. If this second attempt bounces, then as described above, the
  plug-in launch is abandoned and any pending next launch started.

  Note:

    * From here on in, we'll assume that the Message_PlugIn_Open gets a
      reply rather than bounces.

  On reception of Message_PlugIn_Opening from the plug-in handling
  application, Browse checks the flag saying the plug-in will delete the
  parameters file itself, and if unset, removes the file. In any case, any
  RMA space claimed for the filename is released.

  If the plug-in wants the data resource fetched, Browse will start a fetch
  for it, initiating the streaming protocol once enough data to determine
  the type has come in.

  Note:

    * Currently, fetching the code resource is not implemented.

    * Currently, helpers are not supported (the plug-in is assumed to have
      embedded its window in the parent).
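
  The queueing and bounce handling described above might be modelled along
  the following lines. This is a simplified sketch under stated assumptions,
  not the real implementation: the broadcast is reduced to a counter, the
  result of Wimp_StartTask is passed in by the caller as a flag, all names
  are hypothetical, and treating a failed task start as "abandon the launch"
  is a guess, since the text does not say what happens in that case. The
  three-digit upper case form of XXX is likewise an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define QUEUE_MAX 16

typedef struct {
    int filetype;   /* filetype placed in Message_PlugIn_Open       */
    int attempts;   /* how many times the open message has been sent */
} queued_plugin;

typedef struct {
    queued_plugin items[QUEUE_MAX];
    int           count;
    int           broadcasts;   /* stands in for real message broadcasts */
} plugin_queue;

/* Build the system variable name for a filetype (format is an assumption). */
void plugin_alias_name(int filetype, char *out, size_t len)
{
    assert(out != NULL && len > 0);
    snprintf(out, len, "Alias$@PlugInType_%03X",
             (unsigned int) (filetype & 0xFFF));
}

/* Simulate broadcasting Message_PlugIn_Open for the head of the queue. */
static void broadcast_open(plugin_queue *q)
{
    q->items[0].attempts++;
    q->broadcasts++;
}

/* Add an item; launch it only if it is now the sole item in the queue. */
int queue_add(plugin_queue *q, int filetype)
{
    assert(q != NULL);
    if (q->count >= QUEUE_MAX) return 0;

    q->items[q->count].filetype = filetype;
    q->items[q->count].attempts = 0;
    q->count++;

    if (q->count == 1) broadcast_open(q);
    return 1;
}

/* Drop the head item and start launching the next one, if any. */
static void remove_head(plugin_queue *q)
{
    memmove(&q->items[0], &q->items[1],
            (size_t) (q->count - 1) * sizeof q->items[0]);
    q->count--;

    if (q->count > 0) broadcast_open(q);
}

/* Message_PlugIn_Open bounced. On the first bounce the caller is expected
 * to have run the Alias$@PlugInType_XXX variable and to pass the outcome
 * in task_started; on the second bounce the launch is abandoned. */
void on_open_bounced(plugin_queue *q, int task_started)
{
    assert(q != NULL && q->count > 0);

    if (q->items[0].attempts >= 2 || !task_started) {
        remove_head(q);     /* give up; next queued plug-in is launched */
        return;
    }
    broadcast_open(q);      /* second and final attempt */
}
```

  With two items queued, the first bounce leaves both in the queue and sends
  the message again; a second bounce removes the head item and starts the
  launch of the one behind it.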

  The plug-in is removed from the queue allowing any others to be launched,
  and any pending error message is then reported (NB in Browse 2.05 and
  earlier, there's a bug where any error would be reported and the item
  would *not* be removed from the queue, causing odd effects thereafter).

* Streams
  -------

  Browse currently supports streams as files only.

  The plug-in code will open a fetch window for the relevant data. It may be
  the case that data has already been fetched for this URL - Browse
  implements a local 'mini cache' for the duration of a browsing session -
  in which case, the Message_PlugIn_Stream_New is sent to the plug-in
  straight away. Otherwise, data has to start to fetch before this is sent.
  The Message_PlugIn_Stream_New is sent at the same point that a Save
  dialogue box would normally open, were you to be downloading the object
  manually.

  Again, in both cases, RMA space is used to hold the filename and a pointer
  to this placed in the outgoing message block. The message is sent
  UserMessageRecorded, and if it bounces, any fetch in progress for the data
  is immediately shut down (Message_PlugIn_Stream_Destroy with a reason code
  of "error" will be sent out). No attempt to restart the fetch is made.

  If the message does not bounce, the plug-in should have replied with the
  same message type. Browse checks that the message reference is known, and
  if it is, will proceed to check the stream type. Should the stream type
  not be one it can deal with - i.e. currently, as file only -
  Message_PlugIn_Stream_Destroy is sent back with an "error" reason code to
  abort the transfer. Otherwise, Browse flags that the transfer is OK and
  lets the fetcher continue getting the data.

  At any time, if the fetch is aborted, the temporary file Browse was
  writing is removed and Message_PlugIn_Stream_Destroy with a "due to user
  intervention" reason code is sent to the plug-in. Should the fetch
  complete successfully, Browse sends the plug-in
  Message_PlugIn_Stream_As_File.
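
  The stream decisions described so far can be summarised as a small
  decision function. This is only a sketch: the enumerations and the
  function name are hypothetical and do not correspond to values in the
  plug-in protocol, and ignoring a reply whose message reference is not
  known is an assumption, as the text does not say what Browse does then.

```c
#include <assert.h>

/* Stream type the plug-in asked for in its Stream_New reply. */
typedef enum { STREAM_AS_FILE, STREAM_OTHER } stream_type;

/* Things that can happen to a stream once Stream_New has been sent. */
typedef enum {
    EV_STREAM_NEW_BOUNCED,
    EV_STREAM_NEW_REPLIED,
    EV_FETCH_ABORTED,
    EV_FETCH_COMPLETED
} stream_event;

/* What the browser does in response. */
typedef enum {
    DO_IGNORE,            /* assumption: unknown reference is ignored      */
    DO_CONTINUE_FETCH,    /* transfer flagged OK, fetcher carries on       */
    DO_DESTROY_ERROR,     /* Stream_Destroy, "error" reason                */
    DO_DESTROY_USER,      /* Stream_Destroy, "user intervention" reason    */
    DO_SEND_AS_FILE       /* Message_PlugIn_Stream_As_File                 */
} stream_action;

/* ref_known and type are only consulted for EV_STREAM_NEW_REPLIED. */
stream_action stream_next_action(stream_event ev, int ref_known,
                                 stream_type type)
{
    switch (ev) {
    case EV_STREAM_NEW_BOUNCED:
        return DO_DESTROY_ERROR;          /* fetch shut down, no retry */

    case EV_STREAM_NEW_REPLIED:
        if (!ref_known) return DO_IGNORE;
        return type == STREAM_AS_FILE ? DO_CONTINUE_FETCH
                                      : DO_DESTROY_ERROR;

    case EV_FETCH_ABORTED:
        return DO_DESTROY_USER;           /* temporary file is removed too */

    case EV_FETCH_COMPLETED:
        return DO_SEND_AS_FILE;
    }

    assert(0 && "unknown stream event");
    return DO_IGNORE;
}
```

  For instance, a reply asking for anything other than an as-file stream
  maps to DO_DESTROY_ERROR, matching the "as file only" restriction.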

  This is where things get a little dicey for a moment. Browse has been, all
  this time, sending messages pointing to RMA data which it has claimed.
  There's an issue with the plug-in protocol regarding just when data may be
  freed, and who may free it. When Browse has completed the fetch, it
  installs a null event handler. This handler should only be called after
  all the message passing and related events have died away, and so it gives
  the plug-in "time" to respond to the Stream_As_File message - though this
  means the plug-in MUST take local copies of any relevant data immediately.
  When the handler runs, Browse sends out Message_PlugIn_Stream_Destroy with
  a 'success' reason code to warn the plug-in that any RMA pointers it took
  from messages are now invalid, adds a reference to the file it created to
  its 'mini cache', and closes the window that was responsible for the
  fetching. The null handler then deregisters itself.

* Other issues
  ------------

  Browse also supports the Busy, Status and URL_Access messages (the latter
  including relative and absolute targeted and non-targeted forms).

  Browse will send out Message_PlugIn_Reshape if it wants the plug-in to
  move, and will attempt to satisfy ResizeRequest messages from the plug-in
  (it assumes that the width and height specified are in OS units).

  For each plug-in it wants to shut down, Browse will send
  Message_PlugIn_Close. Browse 2.06 and later will also close down any
  relevant streams, sending out Message_PlugIn_Stream_Destroy as required.
  Nothing should be assumed about the order in which these will be received
  or the length of delay (if any) between them (a plug-in should ignore the
  stream related messages if it has closed down the instance already, as it
  will not recognise the handles quoted in the message any more).

  Browse 2.05 and earlier do not shut down fetches for plug-ins when the
  plug-ins are closed; this is a bug, and can lead to unpredictable
  behaviour thereafter (multiple copies of the same data in the 'mini
  cache', files left behind after browser shutdown, attempts to write to the
  same file simultaneously and other such nasties). This only happens if a
  new page is visited or a window is closed when the browser was in the
  middle of fetching data for any plug-ins in the old page.

  In addition, Browse 2.05 or earlier, when closing a page which still has
  queued plug-ins on it, may not correctly flush the first queued item out
  of the queue. This can lead to unexpected launch of plug-ins later on. In
  Browse 2.06 onwards, this bug is fixed.