Deploying localized IE add-ons

IE add-ons are localizable in the same way as any Windows Forms application. But pay attention to the deployment of localized resources – see http://msdn.microsoft.com/en-us/library/y99d1cd3(VS.80).aspx.

Building Browser Helper Objects with Visual Studio 2005

Tony Schreiner, John Sudds
Microsoft Corporation

October 27, 2006

Summary: This article demonstrates how to use Microsoft Visual Studio 2005 to create a simple Browser Helper Object (BHO), a Component Object Model (COM) object that implements the IObjectWithSite interface and attaches itself to Internet Explorer. This article describes how to create an entry-level BHO step-by-step. At first, the BHO displays a message that reads "Hello World!" as Internet Explorer loads a document. Then, the BHO is extended to remove images from the loaded page. This article is written for developers who want to learn how to extend the functionality of the browser and to create Web developer tools for Internet Explorer. (8 printed pages)

Contents

Introduction
Overview
Setting up the Project
Implementing the Basics
Responding to Events
Manipulating the DOM
Summary
Related Topics

Introduction

This article relies on Microsoft Visual Studio 2005 and Active Template Library (ATL) to develop a BHO using C++. We decided to use ATL because it conveniently implements a basic boilerplate that we can extend for our needs. There are other ways to create a BHO, such as using Microsoft Foundation Classes (MFC) or the Win32 API and COM, but ATL is a lightweight library that automatically handles a lot of the details for us, including setting up the registry with the BHO class identifier (CLSID).

Another strength of ATL is its COM-aware smart pointer classes (such as CComPtr and CComBSTR) that manage the lifetime of COM objects. For example, CComPtr calls AddRef as a value is assigned, and calls Release as the object is destroyed or goes out of scope. Smart pointers simplify the code and help eliminate memory leaks. Their stability and reliability are especially useful when used within the scope of a single method.
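The reference-counting behavior described above can be illustrated without any COM headers. The following is a minimal, portable sketch, not ATL's implementation: RefCounted and SmartPtr are hypothetical stand-ins for a COM object and for CComPtr, and they only mimic the AddRef-on-assignment and Release-on-destruction pattern.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object: it counts references the way
// AddRef and Release do, and deletes itself when the count reaches zero.
class RefCounted {
public:
    explicit RefCounted(int* liveObjects) : m_refs(0), m_live(liveObjects) { ++*m_live; }
    unsigned long AddRef() { return ++m_refs; }
    unsigned long Release() {
        unsigned long remaining = --m_refs;
        if (remaining == 0) delete this;
        return remaining;
    }
    unsigned long RefCount() const { return m_refs; }
private:
    ~RefCounted() { --*m_live; }   // only Release may destroy the object
    unsigned long m_refs;
    int* m_live;                   // external counter of live objects
};

// Hypothetical, simplified smart pointer in the spirit of CComPtr.
template <typename T>
class SmartPtr {
public:
    SmartPtr() : m_p(nullptr) {}
    SmartPtr(T* p) : m_p(p) { if (m_p) m_p->AddRef(); }
    SmartPtr(const SmartPtr& other) : m_p(other.m_p) { if (m_p) m_p->AddRef(); }
    ~SmartPtr() { Release(); }
    SmartPtr& operator=(const SmartPtr& other) {
        if (this != &other) {
            T* old = m_p;
            m_p = other.m_p;
            if (m_p) m_p->AddRef();   // take the new reference first
            if (old) old->Release();  // then drop the old one
        }
        return *this;
    }
    // Drops the held reference early, like calling Release on a CComPtr.
    void Release() {
        if (m_p) { T* old = m_p; m_p = nullptr; old->Release(); }
    }
    T* operator->() const { return m_p; }
private:
    T* m_p;
};

// Returns the number of objects still alive after both pointers leave scope.
int LiveObjectsAfterScope() {
    int live = 0;
    {
        SmartPtr<RefCounted> first(new RefCounted(&live));  // count is 1
        SmartPtr<RefCounted> second = first;                // count is 2
        assert(first->RefCount() == 2);
    }   // both destructors call Release; the object deletes itself
    return live;
}
```

Because every path out of the scope runs the destructors, no explicit Release call is needed and none can be forgotten.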

The first part of this article walks you through the process of implementing a simple BHO and verifying that it is loaded by Internet Explorer. The next part demonstrates how to connect the BHO to browser events, and the final part shows a simple interaction with the DHTML Document Object Model (DOM) that changes the appearance of a Web page.

Overview

What exactly is a Browser Helper Object (BHO)? In a nutshell, a BHO is a lightweight DLL extension that adds custom functionality to Internet Explorer. Although it is less common and not the focus of this article, BHOs can also add functionality to the Windows Explorer shell.

BHOs typically do not provide any user interface (UI) of their own. Rather, they function in the background by responding to browser events and user input. For example, BHOs can block pop-ups, auto-fill forms, or add support for mouse gestures. It is a common misconception that BHOs are required by toolbar extensions; however, BHOs used in conjunction with toolbars can provide an even richer user experience.

Note  BHOs are convenient tools for end users and developers alike; however, because BHOs are granted considerable power over the browser and Web content, and because they often go undetected, users should take great care to obtain and install BHOs from reliable sources.

The lifetime of a BHO is the same as the lifetime of the browser instance that it interacts with. In Internet Explorer 6 and earlier, this means that a new BHO is created (and destroyed) for each new top-level window. On the other hand, Internet Explorer 7 creates and destroys a new BHO for each tab. BHOs are not loaded by other applications that host the WebBrowser control or by windows such as HTML dialog boxes.

The primary requirement of a BHO is to implement the IObjectWithSite interface. This interface exposes a method, SetSite, that facilitates the initial communication with Internet Explorer and notifies the BHO when it is about to be released. We create a simple browser extension by implementing this interface, and then adding the CLSID of the BHO into the registry.

Let's get started.

Setting up the Project

To create a BHO project with Microsoft Visual Studio 2005:

  1. On the File menu, click New Project....
    The New Project dialog box appears. This dialog box lists the application types that Visual Studio can create.
  2. Under the Visual C++ node, select "ATL" if it is not already selected, then select "ATL Project" from the Visual C++ project types. Name the project "HelloWorld" and use the default location. Click OK.
  3. In the ATL Project Wizard, ensure that the server type is "Dynamic-link library (DLL)" and click Finish.

At this point, Visual Studio has created boilerplate for a DLL. We now add the COM object that implements the BHO.

  1. In the Solution Explorer panel, right-click on the project and select Class... from the Add submenu.
  2. Select "ATL Simple Object" and click Add. The ATL Simple Object Wizard appears.
  3. In Names of the ATL Simple Object Wizard, type "HelloWorldBHO" as a Short Name. The remaining names are filled in automatically.
  4. In Options of the ATL Simple Object Wizard, select "Apartment" for Threading Model, "No" for Aggregation, "Dual" for Interface, and "IObjectWithSite" for Support.
  5. Click Finish.

The following files are created as part of this project.

  • HelloWorldBHO.h – this header file contains the class definition for the BHO.
  • HelloWorldBHO.cpp – this source file is the main file for the project and contains the COM object.
  • HelloWorld.cpp – this source file implements the exports that expose the COM object through the DLL.
  • HelloWorld.idl – this source file can be used to define custom COM interfaces. For this article, we will not change this file.
  • HelloWorld.rgs – this resource file contains the registry keys that are written and removed when the DLL is registered and unregistered.

Implementing the Basics

The ATL Project Wizard provides a default implementation of SetSite. Although the interface contract of IObjectWithSite implies that this method may be called repeatedly as necessary, Internet Explorer invokes it exactly twice: once to establish a connection, and again as the browser is exiting. Specifically, the SetSite implementation in our BHO performs the following actions:

  • Stores a reference to the site. During initialization, the browser passes an IUnknown pointer to the top-level WebBrowser Control, and the BHO stores a reference to it in a private member variable.
  • Releases the site pointer currently being held. When Internet Explorer passes NULL, the BHO must release all interface references and disconnect from the browser.

As part of the processing of SetSite, the BHO should perform other initialization and uninitialization as required. For example, you can establish a connection point to the browser in order to receive browser events.

HelloWorldBHO.h

Double-click to open HelloWorldBHO.h from the Visual Studio Solution Explorer.

First, include shlguid.h. This file defines interface identifiers for IWebBrowser2 and the events that are used later in the project.

#include <shlguid.h>     // IID_IWebBrowser2, DIID_DWebBrowserEvents2, etc.

Next, in a public section of the CHelloWorldBHO class, declare SetSite.

STDMETHOD(SetSite)(IUnknown *pUnkSite);

The STDMETHOD macro is an ATL convention that marks the method as virtual and ensures that it has the right calling convention for the public COM interface. It helps to demarcate COM interfaces from other public methods that may exist on the class. The STDMETHODIMP macro is likewise used when implementing the member method.

Finally, in a private section of the class declaration, declare a member variable to store the browser site.

private:
    CComPtr<IWebBrowser2>  m_spWebBrowser;

HelloWorldBHO.cpp

Switch now to HelloWorldBHO.cpp and insert the following code for SetSite.

STDMETHODIMP CHelloWorldBHO::SetSite(IUnknown* pUnkSite)
{
    if (pUnkSite != NULL)
    {
        // Cache the pointer to IWebBrowser2.
        pUnkSite->QueryInterface(IID_IWebBrowser2, (void**)&m_spWebBrowser);
    }
    else
    {
        // Release cached pointers and other resources here.
        m_spWebBrowser.Release();
    }

    // Return the base class implementation
    return IObjectWithSiteImpl<CHelloWorldBHO>::SetSite(pUnkSite);
}

During initialization, the browser passes a reference to its top-level IWebBrowser2 interface, which we cache. During uninitialization, the browser passes NULL. To avoid memory leaks and circular reference counts, it's important to release all pointers and resources at that time. Finally, we call the base class implementation so that it can fulfill the rest of the interface contract.

HelloWorld.cpp

When a DLL is loaded, the system calls the DllMain function with a DLL_PROCESS_ATTACH notification. Because Internet Explorer makes extensive use of multi-threading, frequent DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications to DllMain can slow the overall performance of the extension and the browser process. Since this BHO does not require thread-level tracking, we can call DisableThreadLibraryCalls during the DLL_PROCESS_ATTACH notification to avoid the overhead of new thread notifications.

In HelloWorld.cpp, code the DllMain function as follows:

extern "C" BOOL WINAPI DllMain(HINSTANCE hInstance, DWORD dwReason, LPVOID lpReserved)
{
    if (dwReason == DLL_PROCESS_ATTACH)
    {
        DisableThreadLibraryCalls(hInstance);
    }
    return _AtlModule.DllMain(dwReason, lpReserved); 
}

Register the BHO

All that remains is to add the CLSID of the BHO to the registry. This entry marks the DLL as a browser helper object and causes Internet Explorer to load the BHO at start-up. Visual Studio can register the CLSID when it builds the project.

Note  On Windows Vista, Visual Studio requires elevated privileges to interact with the registry. Make sure to start the development environment by right-clicking Microsoft Visual Studio 2005 in the Start menu and selecting Run as administrator.

The CLSID for this BHO is found in HelloWorld.idl, in a block of code similar to the following:

    importlib("stdole2.tlb");
    [
        uuid(D2F7E1E3-C9DC-4349-B72C-D5A708D6DD77),
        helpstring("HelloWorldBHO Class")
    ]

Note that this file contains three GUIDs; we need the CLSID for the class, not those of the library or interface ID.

To create a self-registering BHO:

  1. Open HelloWorld.rgs from the Solution Explorer in Visual Studio.
  2. Add the following code to the bottom of the file:
    HKLM {
      NoRemove SOFTWARE {
        NoRemove Microsoft {   
          NoRemove Windows {
            NoRemove CurrentVersion {
              NoRemove Explorer {
                NoRemove 'Browser Helper Objects' {
                  ForceRemove '{D2F7E1E3-C9DC-4349-B72C-D5A708D6DD77}' = s 'HelloWorldBHO' {
                    val 'NoExplorer' = d '1'
                  }
                }
              }
            }
          }
        }
      }
    }
    
  3. Replace the GUID that follows ForceRemove above with the CLSID of the BHO found in HelloWorld.idl. Do not replace the curly braces.
  4. Save the file, and rebuild the solution from the Build menu. Visual Studio registers the object automatically.

The NoRemove keyword indicates that the key should not be deleted when the BHO is unregistered. Unless you specify this keyword, empty keys will be removed. The ForceRemove keyword indicates that the key and any values and sub-keys that it contains should be deleted. ForceRemove also causes the key to be recreated when the BHO is registered, if the key already exists.

Since this BHO is specifically designed for Internet Explorer, we specify the NoExplorer value to prevent Windows Explorer from loading it. Neither the value nor the type makes any difference—as long as the NoExplorer entry exists, Windows Explorer will not load the BHO.

If you haven't done so already, select Build Solution from the Build menu to build and register the BHO.

Take a Test Drive

For a quick test, set a breakpoint in SetSite and start the debugger by pressing F5. When the Executable for Debug Session dialog box appears, select the "Default Web Browser" and click OK. If Internet Explorer is not your default browser, you can browse for the executable.

Note   On Windows Vista, the Internet Explorer Protected Mode feature launches a separate process and exits, making it a little harder to debug. You can easily turn off Protected Mode for the current session in two ways: launch the browser from an administrative process (such as Visual Studio), or create a local HTML file and specify it as a command line parameter to Internet Explorer.

As the browser starts, it loads the DLL for the BHO. When the breakpoint is hit, note that the pUnkSite parameter is set. Press F5 again to continue loading the home page.

Close the browser to verify that SetSite is called again with NULL.

Responding to Events

Now that you've confirmed that Internet Explorer can load and run the BHO, let's take our example a little further by extending the BHO to react to browser events. In this section, we describe how to use ATL to implement an event handler for DocumentComplete that displays a message box after the page loads.

To be notified of events, the BHO establishes a connection point with the browser; to respond to these events, it implements IDispatch. According to the documentation for DocumentComplete, the event has two parameters: pDisp (a pointer to IDispatch) and pUrl. These parameters are passed to IDispatch::Invoke as part of the event; however, unpacking the event parameters by hand is a non-trivial and error-prone task. Fortunately, ATL provides a default implementation that helps to simplify the event-handling logic.
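To make the difficulty of manual unpacking concrete, here is a portable sketch of what a hand-written handler has to do. One detail that is easy to get wrong is that the arguments arrive in the DISPPARAMS rgvarg array in reverse order, so for DocumentComplete the URL is at index 0 and pDisp is at index 1. MockVariant, MockDispParams, kMockDocumentComplete, and UnpackDocumentComplete are hypothetical simplifications for illustration; they are not the real VARIANT, DISPPARAMS, or DISPID definitions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for VARIANT.
struct MockVariant {
    enum Kind { Empty, Dispatch, Text };
    Kind kind = Empty;
    const void* object = nullptr;   // stands in for an IDispatch pointer
    std::string text;               // stands in for a BSTR
};

// Hypothetical, simplified stand-in for DISPPARAMS.
struct MockDispParams {
    std::vector<MockVariant> rgvarg;   // arguments, LAST parameter first
};

// Placeholder event ID for this sketch; the real value comes from exdispid.h.
const int kMockDocumentComplete = 1;

struct DocumentCompleteArgs {
    bool valid = false;
    const void* browser = nullptr;
    std::string url;
};

// Validates and unpacks the two DocumentComplete parameters by hand.
DocumentCompleteArgs UnpackDocumentComplete(int dispId, const MockDispParams& params) {
    DocumentCompleteArgs out;
    if (dispId != kMockDocumentComplete) return out;   // not our event
    if (params.rgvarg.size() != 2) return out;         // wrong argument count

    // Reverse order: the second parameter (URL) is stored first.
    const MockVariant& urlArg = params.rgvarg[0];
    const MockVariant& dispArg = params.rgvarg[1];
    if (dispArg.kind != MockVariant::Dispatch) return out;   // wrong type
    if (urlArg.kind != MockVariant::Text) return out;        // wrong type

    out.browser = dispArg.object;
    out.url = urlArg.text;
    out.valid = true;
    return out;
}
```

Every event a handler supports needs this kind of count, order, and type checking, which is the repetitive work that the ATL helpers take over.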

HelloWorldBHO.h

Start in HelloWorldBHO.h by including exdispid.h, which defines the dispatch IDs for browser events.

#include <exdispid.h> // DISPID_DOCUMENTCOMPLETE, etc.

Next, we indicate that we want to handle events defined by the DWebBrowserEvents2 interface by adding a class definition for IDispEventImpl, which provides an easy and safe alternative to Invoke for handling events. IDispEventImpl works in conjunction with an event sink map to route events to the appropriate handler function.

class ATL_NO_VTABLE CHelloWorldBHO :
    . . . 
    public IDispEventImpl<1, CHelloWorldBHO, &DIID_DWebBrowserEvents2, &LIBID_SHDocVw, 1, 1>

Next, add ATL macros that route the event to a new OnDocumentComplete event handler method, which takes the same arguments, in the same order, as defined by the DocumentComplete event. Place the following code in a public section of the class.

BEGIN_SINK_MAP(CHelloWorldBHO)
    SINK_ENTRY_EX(1, DIID_DWebBrowserEvents2, DISPID_DOCUMENTCOMPLETE, OnDocumentComplete)
END_SINK_MAP()

    // DWebBrowserEvents2
    void STDMETHODCALLTYPE OnDocumentComplete(IDispatch *pDisp, VARIANT *pvarURL); 

The number supplied to the SINK_ENTRY_EX macro (1) refers to the first parameter of the IDispEventImpl class definition and is used to distinguish between events from different interfaces, if necessary. Also note that you cannot return a value from the event handler; that's OK because Internet Explorer ignores values returned from Invoke anyway.

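Finally, add a private member variable to track whether the object has established a connection with the browser. Initialize it to FALSE in the class constructor so that SetSite never reads an indeterminate value.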

private:
    BOOL m_fAdvised; 

HelloWorldBHO.cpp

To connect the event handler to the browser through the event map, call DispEventAdvise during the processing of SetSite. Likewise, use DispEventUnadvise to break the connection.

Here is the new implementation of SetSite:

STDMETHODIMP CHelloWorldBHO::SetSite(IUnknown* pUnkSite)
{
    if (pUnkSite != NULL)
    {
        // Cache the pointer to IWebBrowser2.
        HRESULT hr = pUnkSite->QueryInterface(IID_IWebBrowser2, (void **)&m_spWebBrowser);
        if (SUCCEEDED(hr))
        {
            // Register to sink events from DWebBrowserEvents2.
            hr = DispEventAdvise(m_spWebBrowser);
            if (SUCCEEDED(hr))
            {
                m_fAdvised = TRUE;
            }
        }
    }
    else
    {
        // Unregister event sink.
        if (m_fAdvised)
        {
            DispEventUnadvise(m_spWebBrowser);
            m_fAdvised = FALSE;
        }

        // Release cached pointers and other resources here.
        m_spWebBrowser.Release();
    }

    // Call base class implementation.
    return IObjectWithSiteImpl<CHelloWorldBHO>::SetSite(pUnkSite);
}

Finally, add a simple OnDocumentComplete event handler.

void STDMETHODCALLTYPE CHelloWorldBHO::OnDocumentComplete(IDispatch *pDisp, VARIANT *pvarURL)
{
    // Retrieve the top-level window from the site.
    HWND hwnd;
    HRESULT hr = m_spWebBrowser->get_HWND((LONG_PTR*)&hwnd);
    if (SUCCEEDED(hr))
    {
        // Output a message box when page is loaded.
        MessageBox(hwnd, L"Hello World!", L"BHO", MB_OK);
    }
}

Notice that the message box uses the top-level window of the site as its parent window, rather than simply passing NULL in that parameter. In Internet Explorer 6, a NULL parent window does not block the application, meaning that the user can continue to interact with the browser while the message box is waiting for user input. In some situations, this can cause the browser to hang or crash. In the rare case that a BHO needs to display a UI, it should always ensure that the dialog box is application modal by specifying a handle to the parent window.

Another Test Drive

Start up Internet Explorer again by pressing F5. After the document has loaded, the BHO displays its message.

The "Hello World!" Message Box

Continue browsing to observe when and how often the message box appears. Notice that the BHO alert is shown not only when the page is loaded, but also when the page is reloaded by clicking the Back button; however, it does not appear when you click the Refresh button. The message box also appears for every new tab and every document loaded in a frame or iframe.

The event is fired after the page is downloaded and parsed, but before the window.onload event is triggered. In the case of multiple frames, the event is fired once for each frame, and the event for the top-level frame is fired last. In the code that follows, we detect the final event of a series by comparing the object passed in the pDisp parameter of the event to the top-level browser that was cached in SetSite.

Manipulating the DOM

The following JavaScript code demonstrates a basic manipulation of the DOM. It hides images on the Web page by setting the display attribute of the image's style object to "none".

function RemoveImages(doc)
{
    var images = doc.images;
    if (images != null)
    {
        for (var i = 0; i < images.length; i++) 
        {
            var img = images.item(i);
            img.style.display = "none";
        }
    }
}

In this final section, we show you how to implement this basic logic in C++.

HelloWorldBHO.h

First, open HelloWorldBHO.h and include mshtml.h. This header file defines the interfaces we need for working with the DOM.

#include <mshtml.h>         // DOM interfaces

Next, define the private member method to contain the C++ implementation of the JavaScript above.

private:
    void RemoveImages(IHTMLDocument2 *pDocument);

HelloWorldBHO.cpp

The OnDocumentComplete event handler now does two new things. First, it compares the cached WebBrowser pointer to the object for which the event is fired; if they are equal, the event is for the top-level window and the document is fully loaded. Second, it retrieves a pointer to the document object and passes it to RemoveImages.

void STDMETHODCALLTYPE CHelloWorldBHO::OnDocumentComplete(IDispatch *pDisp, VARIANT *pvarURL)
{
    HRESULT hr = S_OK;

    // Query for the IWebBrowser2 interface.
    CComQIPtr<IWebBrowser2> spTempWebBrowser = pDisp;

    // Is this event associated with the top-level browser?
    if (spTempWebBrowser && m_spWebBrowser &&
        m_spWebBrowser.IsEqualObject(spTempWebBrowser))
    {
        // Get the current document object from browser...
        CComPtr<IDispatch> spDispDoc;
        hr = m_spWebBrowser->get_Document(&spDispDoc);
        if (SUCCEEDED(hr))
        {
            // ...and query for an HTML document.
            CComQIPtr<IHTMLDocument2> spHTMLDoc = spDispDoc;
            if (spHTMLDoc != NULL)
            {
                // Finally, remove the images.
                RemoveImages(spHTMLDoc);
            }
        }
    }
}

The IDispatch pointer in pDisp contains the IWebBrowser2 interface of the window or frame in which the document has loaded. We store the value in a CComQIPtr class variable, which performs a QueryInterface automatically. Next, to determine if the page is completely loaded, we compare the interface pointer to the one we cached in SetSite for the top-level browser. As a result of this test, we only remove images from documents in the top-level browser frame; documents that do not load into the top-level frame do not pass this test. (For more information, see How To Determine When a Page Is Done Loading in WebBrowser Control and How to get the WebBrowser object model of an HTML frame.)

It takes two steps to retrieve the HTML document object. Because get_Document retrieves a pointer for the active document even if the browser has hosted a document object of another type (such as a Microsoft Word document), we must further query the active document for an IHTMLDocument2 interface to determine if it is indeed an HTML page. The IHTMLDocument2 interface provides access to the contents of the DHTML DOM.

After confirming that an HTML document is loaded, we pass the value to RemoveImages. Note that the argument is passed as a pointer to IHTMLDocument2, not as a CComPtr.

void CHelloWorldBHO::RemoveImages(IHTMLDocument2* pDocument)
{
    CComPtr<IHTMLElementCollection> spImages;

    // Get the collection of images from the DOM.
    HRESULT hr = pDocument->get_images(&spImages);
    if (hr == S_OK && spImages != NULL)
    {
        // Get the number of images in the collection.
        long cImages = 0;
        hr = spImages->get_length(&cImages);
        if (hr == S_OK && cImages > 0)
        {
            for (int i = 0; i < cImages; i++)
            {
                CComVariant svarItemIndex(i);
                CComVariant svarEmpty;
                CComPtr<IDispatch> spdispImage;

                // Get the image out of the collection by index.
                hr = spImages->item(svarItemIndex, svarEmpty, &spdispImage);
                if (hr == S_OK && spdispImage != NULL)
                {
                    // First, query for the generic HTML element interface...
                    CComQIPtr<IHTMLElement> spElement = spdispImage;

                    if (spElement)
                    {
                        // ...then ask for the style interface.
                        CComPtr<IHTMLStyle> spStyle;
                        hr = spElement->get_style(&spStyle);

                        // Set display="none" to hide the image.
                        if (hr == S_OK && spStyle != NULL)
                        {
                            static const CComBSTR sbstrNone(L"none");
                            spStyle->put_display(sbstrNone);
                        }
                    }
                }
            }
        }
    }
}

Interacting with the DOM in C++ is more verbose than JavaScript, but the code flow is essentially the same.

The preceding code iterates over each item in the images collection. In script, it is clear whether the collection element is being accessed by ordinal or by name; however, in C++ you must manually disambiguate these arguments by passing an empty variant. We again rely on an ATL helper class—this time CComVariant—to minimize the amount of code that we have to write.

Final Notes

To facilitate scripting, all objects in the DOM use IDispatch to expose properties and methods that are derived from multiple interfaces. In C++, however, you must explicitly query for the interface that supports the property or method you want to use. For example, an image object supports both the IHTMLElement and IHTMLImgElement interfaces. Therefore, to retrieve a style object for an image, you first have to query for an IHTMLElement interface, which exposes the get_style method.

Also note that COM rules do not guarantee a valid pointer on failure; therefore, you need to check the HRESULT after every COM call. Moreover, for many DOM methods it is not an error to return a NULL value, so you need to be careful to check both the return value and the pointer value. To make the check even safer, always initialize the pointer to NULL beforehand. Adopting a defensive, verbose, and error-tolerant coding style can help to prevent unpredictable bugs later.

Summary

There are various types of BHOs with a wide range of purposes; however, all BHOs share one common feature: a connection to the browser. Because of their ability to tightly integrate with Internet Explorer, BHOs are valued by countless developers who want to extend the functionality of the browser. This article demonstrated how to create a simple BHO that modifies the style attributes of IMG elements in a loaded document. We invite you to extend this entry-level example as you like. You can further explore the possibilities by visiting the following links.

Related Topics


Writing Stable Browser Extensions

The stability of Windows Internet Explorer is adversely affected by poorly implemented extensions such as Browser Helper Objects (BHOs), toolbars, or Microsoft ActiveX controls. This article summarizes important guidelines developers should follow when creating Component Object Model (COM) extensions to customize the browser and provides tips and best practices for well-behaved browser extensions that do not cause Internet Explorer to crash or become unresponsive ("hang").

COM Considerations

Tip 1: Use AddRef and Release correctly.

AddRef and Release control the life cycle of a COM object through reference counting; AddRef informs the COM object that one piece of code is now using the object and Release tells the COM object that one piece of code has finished using the object. While the reference count is greater than zero, the object remains active. When the reference count drops to zero, the object destroys itself to free memory.

Release must be called once and only once for each call to AddRef. If AddRef and Release are not used carefully, COM objects can destroy themselves early and cause a crash (too many calls to Release), or negatively impact performance by occupying memory until the browser closes (too many calls to AddRef). For more information, see Managing Object Lifetimes through Reference Counting.

Tip 2: Implement DllCanUnloadNow correctly and efficiently.

Every DLL that contains COM objects must implement DllCanUnloadNow, which is called by COM to see if the DLL can be unloaded. An incorrect or inefficient implementation of DllCanUnloadNow can cause crashes, hangs, or drops in performance.

DllCanUnloadNow should return S_FALSE immediately if the DLL cannot be unloaded. If the DLL can be unloaded, make sure all active threads created by the DLL have been terminated, any memory allocated by the DLL has been freed, and all user interface hooks (Window, mouse, or keyboard hooks) have been removed. Call CoUninitialize once for each CoInitialize that has been called by the DLL. Lastly, because COM makes no guarantees about when a DLL can be unloaded, the DLL must be fully ready to unload before DllCanUnloadNow returns S_OK. For more information, refer to the DllCanUnloadNow documentation.
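The decision logic can be kept cheap by maintaining a count of outstanding objects and locks and consulting only that count when asked. The sketch below is portable and illustrative: ModuleLockCount, MockHresult, kOk, and kFalse are hypothetical names, with kOk and kFalse chosen to match the numeric values of S_OK and S_FALSE. It is not ATL's module class.

```cpp
#include <atomic>
#include <cassert>

typedef long MockHresult;
const MockHresult kOk = 0;      // same numeric value as S_OK
const MockHresult kFalse = 1;   // same numeric value as S_FALSE

// Hypothetical module-wide counter: incremented for every live COM object
// or explicit server lock, decremented when each one goes away.
class ModuleLockCount {
public:
    void Lock() { ++m_count; }
    void Unlock() { --m_count; }

    // Answers immediately without doing any cleanup work itself:
    // "cannot unload" while anything is outstanding, "can unload" otherwise.
    MockHresult CanUnloadNow() const {
        return m_count.load() == 0 ? kOk : kFalse;
    }
private:
    std::atomic<long> m_count{0};
};
```

In this arrangement, threads, hooks, and allocations are torn down as the count falls to zero, so the query itself stays a single atomic read.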

Code Safety

Tip 3: Always validate external input to prevent buffer overruns.

One of the most common security and stability errors is the buffer overrun or buffer overflow, which happens when code writes more information to a memory buffer than it was designed to hold. Most buffer overruns are caused by bad input—writing too much data into a memory space so that it overwrites adjacent memory. Although this error commonly results in a program crash, malicious attackers can exploit this vulnerability to take control of the system. Internet Explorer extension developers should take extra precautions to avoid buffer overruns. For more information, see Avoiding Buffer Overruns.
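As a small illustration of validating input before writing to a fixed-size buffer, the sketch below rejects input that does not fit instead of copying it. CopyBounded is a hypothetical helper written for this example, using only the C++ standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies srcLen characters from src into dest and null-terminates the result.
// Returns false, leaving dest as an empty string, if any argument is invalid
// or if the input plus its terminator would not fit in destSize bytes.
bool CopyBounded(char* dest, std::size_t destSize,
                 const char* src, std::size_t srcLen) {
    if (dest == nullptr || destSize == 0) return false;
    dest[0] = '\0';                               // defined state on failure
    if (src == nullptr || srcLen >= destSize) {   // need room for '\0'
        return false;                             // reject, do not overrun
    }
    std::memcpy(dest, src, srcLen);
    dest[srcLen] = '\0';
    return true;
}
```

The length check happens before any byte is written, so oversized input from a Web page or another process can never reach memory beyond the buffer.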

Tip 4: Use the /GS compiler switch to add extra protection against buffer overruns.

Microsoft Visual C++ developers can use the /GS compiler option to detect buffer overruns. This switch introduces a new stack layout that includes a computed security cookie between the buffer and the return address of the function. If this value is overwritten, the application will report the error and quit. For more information, see Compiler Security Checks In Depth.

Tip 5: Check the return values of each function for errors before continuing.

When calling functions, developers should take care to check the validity of parameters they are passing. The parameters should be allocated properly, of the correct type, and within the range of values the function expects. This can prevent function errors before they occur; however, not all errors can be anticipated. If an error would affect the execution of the rest of your routine, you should detect and handle it. Take special care when introducing new error checking code. It is a frequent error to forget to release resources in newly introduced error code paths. See also Exception Handling later in this article.
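One way to keep newly added error paths from leaking resources is to tie each resource to an object whose destructor releases it. The following portable sketch assumes a hypothetical resource tracked by the counter g_openHandles; HandleGuard, Status, ParseStep, and ProcessInput are illustrative names, not part of any Windows API.

```cpp
#include <cassert>

int g_openHandles = 0;   // hypothetical count of resources currently held

// Acquires the resource in the constructor, releases it in the destructor.
class HandleGuard {
public:
    HandleGuard() { ++g_openHandles; }
    ~HandleGuard() { --g_openHandles; }
    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;
};

enum Status { StatusOk, StatusInvalidArg, StatusParseFailed };

// Hypothetical step that can fail: negative values are treated as errors.
Status ParseStep(int value) {
    return value >= 0 ? StatusOk : StatusParseFailed;
}

Status ProcessInput(const int* values, int count) {
    // Validate parameters before acquiring anything.
    if (values == nullptr || count <= 0) return StatusInvalidArg;

    HandleGuard guard;   // resource held from here on
    for (int i = 0; i < count; ++i) {
        Status s = ParseStep(values[i]);
        if (s != StatusOk) return s;   // early exit; guard still releases
    }
    return StatusOk;
}
```

Each return value is checked before the routine continues, and adding another early return later cannot skip the cleanup.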

Threading

Tip 6: Handle your chosen threading model appropriately.

The apartment threading model is recommended for extensions, but whether the extension is apartment-modeled or free-threaded, it should follow the rules appropriate to its chosen model. Internet Explorer is multithreaded and will call a free-threaded extension on multiple threads, so use great care when writing a free-threaded extension. For more information, see Processes, Threads, and Apartments.

Tip 7: Avoid single threaded controls.

Since version 4.0, Internet Explorer requires all ActiveX controls that it hosts to be at least apartment-threaded. The browser has performance and stability problems (hangs) with single threaded objects. See How to troubleshoot ActiveX control crashes in Internet Explorer for more information.

Tip 8: Create a worker thread to perform time-consuming tasks in the background.

Whenever the main thread for the extension has to wait for an extended period, any user input or UI messages are ignored until the wait is over. You can avoid this delay by calling functions on a different thread than the main thread that handles the UI. Worker threads are especially important if the extension calls out of the browser process, or uses file or network IO functions, as these operations can create a noticeable delay for users.

Tip 9: Provide a cancel option for long operations.

For single threaded applications, developers should create a Cancel button for any extended activity (or any activity that has the possibility of becoming extended). Applications with separate UI and worker threads should allow the user to cancel any extended operations on the UI thread.
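A background task with a cancel option can be sketched with standard C++ threads; an extension would more likely use the Win32 thread functions, so treat this as a portable illustration only. CancellableTask is a hypothetical class, and the one-millisecond sleep stands in for a real unit of work.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Runs a long operation on a worker thread and lets the UI thread cancel it.
class CancellableTask {
public:
    CancellableTask() : m_cancel(false), m_done(0) {}
    ~CancellableTask() { Cancel(); Join(); }   // never abandon the thread

    // Starts the worker. Call at most once per object.
    void Start(int totalSteps) {
        m_worker = std::thread([this, totalSteps] {
            // The flag is checked between steps, so a cancel request
            // takes effect after at most one unit of work.
            for (int i = 0; i < totalSteps && !m_cancel.load(); ++i) {
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
                ++m_done;
            }
        });
    }

    void Cancel() { m_cancel.store(true); }    // e.g. from a Cancel button
    void Join() { if (m_worker.joinable()) m_worker.join(); }
    int StepsDone() const { return m_done.load(); }

private:
    std::atomic<bool> m_cancel;
    std::atomic<int> m_done;
    std::thread m_worker;
};
```

The UI thread only sets a flag and returns immediately; the worker notices the request at its next checkpoint and finishes on its own.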

Tip 10: Use the native Windows synchronization mechanisms.

Microsoft Win32 provides a rich set of synchronization constructs (for example, Critical Sections or Semaphores) to prevent threads from accessing data simultaneously. Developers should not create their own synchronization objects and should not avoid synchronization in order to bolster performance. For more information, see About Synchronization.

Tip 11: Never use TerminateThread.

The Win32 TerminateThread function forcibly terminates threads, which runs the risk of orphaning held locks and leaking the thread's stack allocation, among other things. Unless you know exactly what the thread is doing, you run the risk of putting the system into an inconsistent state. (For more information, see TerminateThread.) Instead, use a "shutdown" synchronization event to signal when worker threads are to finish.

For the same reason, take special care when using ExitThread too. When a thread exits, it should release all locks and any resources it has acquired based on the program logic.
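
The "shutdown" event pattern can be sketched as follows. This is a portable approximation, not Win32 code: the ShutdownEvent class stands in for a manual-reset event that a real extension would create with CreateEvent and signal with SetEvent, and both class names are hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Portable stand-in for a Win32 manual-reset event.
class ShutdownEvent {
public:
    void Signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_all();
    }

    // Returns true if the event was signaled before the timeout elapsed.
    bool WaitFor(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return signaled_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Hypothetical worker that does periodic work until it is asked to stop.
class PollingWorker {
public:
    PollingWorker() : thread_([this] { Run(); }) {}
    ~PollingWorker() { Stop(); }

    // Signals shutdown, then waits for the thread to leave Run() by itself.
    void Stop() {
        shutdown_.Signal();
        if (thread_.joinable()) thread_.join();
    }

    // Only meaningful after Stop() has returned.
    bool ExitedCleanly() const { return exitedCleanly_; }

private:
    void Run() {
        // Wake every 10 ms to do work; leave as soon as shutdown is signaled.
        while (!shutdown_.WaitFor(std::chrono::milliseconds(10))) {
            // Periodic work goes here; any lock taken is released before the
            // next wait, so none is held when the thread exits.
        }
        exitedCleanly_ = true;  // cleanup runs on the worker thread itself
    }

    ShutdownEvent shutdown_;
    bool exitedCleanly_ = false;
    std::thread thread_;  // declared last so the event exists before it starts
};
```

Because the thread returns from its own function, its stack is freed normally and it never exits while holding a lock, which is exactly what TerminateThread cannot guarantee.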

Timeouts

Tip 12: Do not use extremely long timeouts or INFINITE as a timeout value.

Whenever you call any function that specifies a timeout value, resist the temptation to wait forever. An infinite timeout value can cause the extension (and the browser itself) to appear to hang because user input cannot be processed until the function returns. Functions that cause a long wait without timing out should also be avoided.
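
A bounded wait might look like the sketch below. It uses std::future::wait_for from the C++ standard library as a stand-in for a Win32 wait that is given a finite timeout instead of INFINITE; the helper name TryGet is hypothetical.

```cpp
#include <chrono>
#include <future>

// Waits for `pending` for at most `timeout`. On success, stores the result in
// `value` and returns true. On timeout, returns false and leaves `pending`
// untouched so the caller can keep the UI responsive and try again later.
template <typename T>
bool TryGet(std::future<T>& pending, std::chrono::milliseconds timeout,
            T& value) {
    if (pending.wait_for(timeout) != std::future_status::ready) {
        return false;
    }
    value = pending.get();
    return true;
}
```

The caller decides what a timeout means (retry, show progress, or give up), which keeps control on the UI thread rather than inside a blocked call.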

Tip 13: Use the SMTO_ABORTIFHUNG flag.

Instead of SendMessage, use PostMessage or SendMessageTimeout with the SMTO_ABORTIFHUNG flag.

The SendMessage function does not return until the message being sent has been fully processed. This can cause a long wait if the application is still processing a previous message, or if the application is busy and not processing messages at all. SendMessageTimeout with the SMTO_ABORTIFHUNG flag returns without waiting for the timeout period to elapse if the receiving thread is not responding.

DllMain

Tip 14: Use DisableThreadLibraryCalls to avoid new thread notifications.

All browser extensions are implemented as DLLs and consequently they export a DllMain function, which is called by Windows when the DLL is attached or detached from the browser process. Because Internet Explorer makes extensive use of multithreading, frequent DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications to DllMain can slow the overall performance of the extension and the browser process. If your extension does not require thread-level tracking, call DisableThreadLibraryCalls during the DLL_PROCESS_ATTACH notification. For more information, see DllMain.

Tip 15: Create a separate routine for complex initialization.

There are limits to what you can do in a DLL entry point. It is possible to introduce deadlocks in DllMain by calling complex APIs, such as those found in Shell32.dll. This happens because the DllMain function runs while holding a lock for the OS loader, which is acquired every time a DLL is loaded. In principle, any API that can trigger a DLL load behind the scenes should be avoided inside DllMain. (Refer to "DllMain Restrictions" section of Mixed DLL Loading Problem for more information.) If complex initialization is needed, create a global initialization function for the DLL and require applications to call the initialization routine before calling any other routines in the DLL.

Memory Management

Tip 16: Isolate the memory used by the extension in one heap.

Whenever possible, Internet Explorer extensions should use the Win32 HeapCreate function to allocate memory instead of using the default memory allocation provided by the compiler (malloc, for example). Do not use GetProcessHeap in conjunction with HeapAlloc; instead, use HeapCreate to allocate a heap dedicated to the extension's memory needs.

Tip 17: Enable memory protection to help mitigate online attacks.

Internet Explorer 7 on Windows Vista introduced an off-by-default Internet Control Panel option to Enable memory protection to help mitigate online attacks. This option is also referred to as Data Execution Prevention or No-Execute (DEP/NX). The option is enabled by default in Internet Explorer 8 on Windows Server 2008 and Windows Vista with Service Pack 1 (SP1).

DEP/NX helps to frustrate attacks by refusing to run code that is marked non-executable in memory. DEP/NX, combined with Address Space Layout Randomization (ASLR), makes it harder for attackers to exploit memory-related vulnerabilities like buffer overruns. Best of all, the protection applies to both Internet Explorer and the add-ons it loads.

For Internet Explorer 7, DEP/NX was disabled by default for compatibility because some popular add-ons had been built with an older version of Active Template Library (ATL). Before version 7.1 SP1, ATL relied upon dynamically generated code in a way that was not compatible with DEP/NX. Fortunately, new DEP/NX APIs have been added to Windows Server 2008 and recent Windows Service Packs to enable use of DEP/NX while retaining compatibility with older versions of ATL.

If you build Internet Explorer add-ons, you can help ensure users enjoy a smooth upgrade to Internet Explorer 8 by taking the following steps:

  • If your code depends on older versions of ATL, rebuild it with ATL v7.1 SP1 or later (Microsoft Visual Studio 2005 includes ATL 8).
  • Set the /NXCompat linker option to indicate that your extension is compatible with DEP/NX.
  • Test your code with DEP/NX enabled by using Internet Explorer 8 on Windows Vista with SP1, or Internet Explorer 7 after enabling the DEP/NX option. (To enable the DEP/NX option, run Internet Explorer as an administrator, and then set the appropriate checkbox in the Advanced tab of Internet Options.)
  • Use other available compiler options like stack defense (/GS), safe exception handling (/SafeSEH), and ASLR (/DynamicBase).

Exception Handling

Tip 18: Always specify the exceptions you intend to catch.

Internet Explorer extensions that take advantage of Win32 Structured Exception Handling should use GetExceptionCode to check the exception type before executing the handler. It is a mistake to catch access violations (AVs) and stack overflow errors because doing so hides inconsistent state in the process. Catching AVs does not make the extension more robust; it just transforms a crash into data corruption.

After a __try statement, make sure the __except expression only returns TRUE for a limited number of exceptions that the extension can handle. Never execute an exception handler for all exceptions, and do not use the UnhandledExceptionFilter or SetUnhandledExceptionFilter functions. For more information, see Structured Exception Handling.
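
A selective filter could be sketched like this. The numeric values match the Windows SDK definitions and are repeated here only so that the function compiles without windows.h. The function name and the choice of integer divide-by-zero as the one exception the extension can handle are assumptions for illustration.

```cpp
#include <cstdint>

// Values as defined in the Windows SDK headers.
constexpr std::uint32_t kAccessViolation = 0xC0000005;  // EXCEPTION_ACCESS_VIOLATION
constexpr std::uint32_t kStackOverflow   = 0xC00000FD;  // EXCEPTION_STACK_OVERFLOW
constexpr std::uint32_t kIntDivideByZero = 0xC0000094;  // EXCEPTION_INT_DIVIDE_BY_ZERO
constexpr int kExecuteHandler = 1;                      // EXCEPTION_EXECUTE_HANDLER
constexpr int kContinueSearch = 0;                      // EXCEPTION_CONTINUE_SEARCH

// Runs the handler only for exceptions this code knows how to recover from.
// Everything else, including access violations and stack overflows, is left
// unhandled so that the process fails visibly instead of continuing corrupted.
int FilterExpectedExceptions(std::uint32_t code) {
    switch (code) {
    case kIntDivideByZero:
        return kExecuteHandler;
    default:
        return kContinueSearch;
    }
}

// With the Microsoft compiler, the filter would be used like this:
//
//   __try {
//       result = numerator / denominator;
//   }
//   __except (FilterExpectedExceptions(GetExceptionCode())) {
//       result = 0;
//   }
```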

Delivery

Tip 19: Provide a well-designed installation experience.

If the Internet Explorer extension is collected into a package for distribution to users, the install program should only install on versions of the operating system that have been tested, and refuse to install on those that are unsupported. Additionally, the install program for the extension should also completely uninstall the extension's files.

Tip 20: Provide symbol (PDB) files for each external release.

If you are uncomfortable with sharing complete private debugging information, you might consider offering public ("stripped") symbols instead.

Tip 21: Sign up for the Windows Error Reporting system.

Windows Error Reporting (WER) is a set of Windows technologies that capture software crash data and support end-user reporting of crash information. Through Winqual services, software and hardware vendors can access reports in order to analyze and respond to these problems. WER technologies are implemented in Windows XP and Windows Server 2003 operating systems. For more information, see Windows Error Reporting: Getting Started.

Add-in Express™ 2010 for Internet Explorer® and Microsoft® .net

Add-in Express for Internet Explorer and .net at first hand

IDEs

  • Visual Studio 2005, 2008, 2010
  • Visual Basic .NET
  • Visual C# .NET
  • Visual C++ .NET
  • Delphi Prism 2009, 2010, 2011, XE

Applications

  • Internet Explorer 6
  • Internet Explorer 7 (32-, 64 bit)
  • Internet Explorer 8 (32-, 64 bit)

Add-in Express for Internet Explorer is the only all-in-one framework that offers you a simple and quick way to customize Microsoft Internet Explorer with your own browser extensions. It completely supports the Internet Explorer Extensibility API and provides a coherent set of .net components, visual designers and deployment tools that make your IE add-on development and deployment very comfortable.

You write applied code only

Add-in Express is based entirely on the Rapid Application Development approach and empowers you to develop professional extensions for Internet Explorer with a couple of clicks. Add-in Express is written in C#. Its programming model and run-time code are based on the Internet Explorer SDK and provide the most effective way to extend the IE GUI, access Internet Explorer objects and handle their events.

Written in C# and optimized for the IE 7 multi-threaded and IE 8 "multi-processed" architecture, Add-in Express wraps the Internet Explorer Extensibility API with a strong programming model and an elegant solution design. Now you can simply use IE-specific modules and components instead of digging through numerous technical articles and hunting for workarounds. Just run the Add-in Express IE solution wizard, create your project, tune the deployment scenario, and start coding. Forget about COM interfaces, the registry, JavaScript, and IE threads and processes, and concentrate on your own applied code!

Key features and benefits

Internet Explorer extensibility

Based on the Internet Explorer Extensibility, Add-in Express provides you with an integrated solution that allows adding your custom menu items, toolbar buttons, context menus, toolbars, side-bars and keyboard shortcuts to the Internet Explorer GUI.

Integrated solution in focus

Add-in Express integrates all Internet Explorer Extensibility features in one solution with a strong architecture and a lucid interaction model with Internet Explorer objects. You needn't follow the IE SDK way with separate ActiveX controls, script files and registry keys.

Visual designers

Add-in Express provides a component-centric model for programming IE add-ons. You use visual designers and components for customizing the Internet Explorer menu, toolbars or side-bars, and for accessing Internet Explorer objects and their events. In addition, all your add-ons are script-enabled.

Version-neutrality

Add-in Express includes version-neutral Internet Explorer interop assemblies based on the IE 6 type library. With these assemblies you make your add-ons compatible with the most popular IE versions: 6, 7 and 8. One code base, one project, one solution, one deployment package for all IE versions.

32 and 64 bits in one project

Add-in Express delivers its own 32-bit and 64-bit add-on loaders, automatically includes them in add-on setup projects and registers the loaders for both Internet Explorer versions, 32- and 64-bit. The loaders run your add-ons and isolate them in their own app domains. All this makes your add-ons x86 and x64-compatible, secure and isolated.

Deployment experience

Deployment is the premium feature of all Add-in Express projects. Each of your Add-in Express solutions is msi-based and web-enabled. Just build your IE extension and its setup project and publish the resulting msi to your deployment server.

Deep integration with MS Office add-ins

Add-in Express adds a special template to MS Office add-in projects based on Add-in Express for Microsoft .NET. It enables you to create Office add-ins and Internet Explorer add-ons with a common code base within one project.

True RAD - you write applied code only

Add-in Express is completely based on the True RAD paradigm and reduces the time you spend on COM interfaces, IE-related registry entries and IE add-on deployment. You write applied code only; Add-in Express implements everything else.

Browser Helper Object

Add-on Manager from Windows XP SP2 Internet Explorer

A Browser Helper Object (BHO) is a DLL module designed as a plugin for Microsoft's Internet Explorer web browser to provide added functionality. BHOs were introduced in October 1997 with the release of version 4 of Internet Explorer. Most BHOs are loaded once by each new instance of Internet Explorer. However, in the case of the Windows Explorer, a new instance is launched for each window.

Some modules enable the display of different file formats not ordinarily interpretable by the browser. The Adobe Acrobat plugin that allows Internet Explorer users to read PDF files within their browser is a BHO.

Other modules add toolbars to Internet Explorer, such as the Alexa Toolbar that provides a list of web sites related to the one you are currently browsing, or the Google Toolbar that adds a toolbar with a Google search box to the browser user interface.

Concerns

The BHO API exposes hooks that allow the BHO to access the Document Object Model (DOM) of the current page and to control navigation. Because BHOs have unrestricted access to the Internet Explorer event model, some forms of malware have also been created as BHOs. For example, the Download.ject malware installs a BHO that would activate upon detecting a secure HTTP connection to a financial institution, record the user's keystrokes (intending to capture passwords) and transmit the information to a website used by Russian computer criminals. Other BHOs such as the MyWay Searchbar track users' browsing patterns and pass the information they record to third parties.

Many BHOs introduce visible changes to a browser's interface, such as installing toolbars in Internet Explorer and the like, but others run without any change to the interface. This renders it easy for malicious coders to conceal the actions of their browser add-on, especially since, after being installed, the BHO seldom requires permission before performing further actions. For instance, variants of the ClSpring trojan use BHOs to install scripts to provide a number of instructions to be performed such as adding and deleting registry values and downloading additional executable files, all completely transparent to the user [1]. The DyFuCA spyware even replaces IE's general error page with an ad page.

In response to the problems associated with BHOs and similar extensions to Internet Explorer, Microsoft debuted an Add-on Manager in Internet Explorer 6 with the release of Service Pack 2 for Windows XP (updating it to IE6 Security Version 1, a.k.a. SP2). This utility displays a list of all installed BHOs, browser extensions and ActiveX controls, and allows the user to enable or disable them at will. There are also free tools (such as BHODemon) that list installed BHOs and allow the user to disable malicious extensions. Spybot S&D has a similar tool built in to allow the user to disable installed BHOs. Many anti-spyware applications also offer the capability to block the download or install of BHOs identified as malicious.

In the IE9 Beta, BHOs and toolbars are not loaded when a link pinned to the taskbar is accessed.

Browser Helper Objects: The Browser the Way You Want It

Dino Esposito
Microsoft Corporation

January 1999

April 9, 2004 security update: Please also see Security Considerations: Programming and Reusing the Browser to learn more about addressing browser security issues.

Summary: Describes how to use BHOs to customize your browser. (16 printed pages) Covers:

Introduction
Program Customization
What Are Browser Helper Objects?
The Lifecycle of Helper Objects
The IObjectWithSite Interface
Writing a Browser Helper Object
Detecting Who's Calling
Getting in Touch with WebBrowser
Getting Events from the Browser
Accessing the Document Object
Managing the Code Window
Registration of Helper Objects
Summary

Introduction

There are sometimes circumstances in which you need a more or less specialized version of the browser. Sometimes you work around this by developing a completely custom module built on top of the WebBrowser control, complete with buttons, labels, and whatever else the user interface requires. In this case, you're free to add to that browser any new, nonstandard feature you want. But what you actually have is just a new, nonstandard browser. The WebBrowser control is just the parsing engine of the browser. This means there still remains a number of UI-related tasks for you to do: adding an address bar, toolbar, history, status bar, channels, and favorites, just to name a few. So, to create a custom browser you have to write two types of code: the code that transforms the WebBrowser control into a full-fledged browser like Microsoft® Internet Explorer, and the code that implements the new features you want it to support. Wouldn't it be nice if there was a straightforward way to customize Internet Explorer instead? Browser Helper Objects (BHO) do just that.

Program Customization

Historically speaking, the first way to customize the behavior of a program was through subclassing. By this means, you could change the way a given window in a program processed messages and actually obtain a different behavior. Although considered a brute-force approach, because the victim is largely unaware of what happens, it's been the only choice for a long time.

With the advent of the Microsoft Win32® API, interprocess subclassing was discouraged and made a bit harder to code. If you're brave-hearted, however, and pointers have never scared you, and above all if you're used to living in symbiosis with system-wide hooks, you might even find it too simple. But this is not always the case. Despite the cleverness of the programming, the point is that each Win32 process runs in its own address space, and breaking the process boundaries is somewhat incorrect. On the other hand, there might be circumstances that require you to do this with the best of intentions. More often, customization might be a specific feature the program itself allows by design.

In the latter case, the programs search for additional modules in well-known and prefixed disk zones, load, initialize, and then leave them free to do the job they have been designed to do. This is exactly what happens with the Internet Explorer browser and its helper objects.

What Are Browser Helper Objects?

From this point of view, Internet Explorer is just like any other Win32-based program with its own memory space to preserve. With Browser Helper Objects you can write components—specifically, in-process Component Object Model (COM) components—that Internet Explorer will load each time it starts up. Such objects run in the same memory context as the browser and can perform any action on the available windows and modules. For example, a BHO could detect the browser's typical events, such as GoBack, GoForward, and DocumentComplete; access the browser's menu and toolbar and make changes; create windows to display additional information on the currently viewed page; and install hooks to monitor messages and actions.

Before going any further with the nitty-gritty details of BHO, there are a couple of points I need to illuminate further. First, the BHO is tied to the browser's main window. In practice, this means a new instance of the object is created as soon as a new browser window is created. Any instance of the BHO lives and dies with the browser's instance. Second, BHOs only exist in Internet Explorer, version 4.0 and later.

If you're running the Microsoft Windows® 98, Windows 2000, Windows 95, or Windows NT® version 4.0 operating system with the Active Desktop™ Shell Update (shell version 4.71), BHOs are supported also by Windows Explorer. This has some implications that I'll talk more about later when making performance considerations and evaluating the impact of BHOs.

In its simplest form, a BHO is a COM in-process server registered under a specific registry key. Upon startup, Internet Explorer looks up that key and loads all the objects whose CLSID is stored there. The browser initializes the object and asks it for a certain interface. If that interface is found, Internet Explorer uses the methods provided to pass its IUnknown pointer down to the helper object. This process is illustrated in Figure 1.

Figure 1. How Internet Explorer loads and initializes helper objects. The BHO site is the COM interface used to establish a communication.

The browser may find a list of CLSIDs in the registry and create an in-process instance of each. As a result, such objects are loaded in the browser's context and can operate as if they were native components. Due to the COM-based nature of Internet Explorer, however, being loaded inside the process space doesn't help that much. Put another way, it's true that the BHO can do a number of potentially useful things, like subclassing constituent windows or installing thread-local hooks, but it is definitely left out from the browser's core activity. To hook on the browser's events or to automate it, the helper object needs to establish a privileged and COM-based channel of communication. For this reason, the BHO should implement an interface called IObjectWithSite. By means of IObjectWithSite, in fact, Internet Explorer will pass a pointer to its IUnknown interface. The BHO can, in turn, store it and query for more specific interfaces, such as IWebBrowser2, IDispatch, and IConnectionPointContainer.

Another way to look at BHOs is in terms of Internet Explorer shell extensions. As you know, a Windows shell extension is a COM in-process server that Windows Explorer loads when it is about to perform a certain action on a document—for example, displaying its context menu. By writing a COM module that implements a few COM interfaces, you're given a chance to add new items to the context menu and then handle them properly. A shell extension must also be registered in such a way that Windows Explorer can find it. A Browser Helper Object follows the same pattern—the only changes are the interfaces to implement. Slightly different is the trigger that causes a BHO to be loaded. Despite the implementation differences, however, shell extensions and BHOs share a common nature, as the following table demonstrates.

Table 1. How Shell Extensions and Browser Helper Objects Implement Common Features

Feature | Shell extension | Browser Helper Object
Loaded by | Windows Explorer. | Internet Explorer (and Windows Explorer for shell version 4.71 and later).
Triggered by | User's action on a document of a certain class (that is, right-click). | Opening of the browser's window.
Unloaded when | A few seconds after the reference count goes to 0. | The browser window that caused it to load gets closed.
Implemented as | COM in-process DLL. | COM in-process DLL.
Registration requirements | Usual entries for a COM server plus other entries, depending on the type of shell extension and the document type that it will apply to. | Usual entries for a COM server plus one entry to qualify it as a BHO.
Interfaces needed | Depends on the type of the shell extension. | IObjectWithSite.

If you're interested in shell extensions, see the MSDN Library Online or CD documentation for a primer. For deeper coverage, check out my recently published book, Professional Shell Programming for Windows (Wrox Press, 1-861001-84-3).

The Lifecycle of Helper Objects

As I mentioned earlier, BHOs aren't just supported by Internet Explorer. Provided you're running at least shell version 4.71, your BHOs will also be loaded by Windows Explorer—meaning that a unique browser can navigate both the Web and local disks with a similar user experience. The next table provides a product-oriented view of the various shell versions available today. The shell version number depends on the version information stored in shell32.dll.

Table 2. Browser Helper Objects Support for the Various Shell Versions

Shell version | Installed products | BHOs supported by
4.00 | Windows 95 and Windows NT 4.0 with or without Internet Explorer 4.0 or earlier. Note: The Shell Update isn't installed. | Internet Explorer 4.0
4.71 | Windows 95 and Windows NT 4.0 with Internet Explorer 4.0 with the Active Desktop Shell Update release. | Both Internet Explorer and Windows Explorer
4.72 | Windows 98. | Both Internet Explorer and Windows Explorer
5.00 | Windows 2000. | Both Internet Explorer and Windows Explorer

A Browser Helper Object is loaded when the main window of the browser is about to be displayed and is unloaded when that window is destroyed. If you open more copies of the browser window, more instances of the BHO will be created. The BHO is loaded regardless of the command line that launches the browser. For example, it gets loaded even if you simply want to see only a specific HTML page or a given folder. In general, the BHO is taken into account whenever either explorer.exe or iexplore.exe executes. If you set the "Open each folder in its own window" folder setting, the BHO will load each time you open a folder.

Figure 2. With this setting, each time you open a folder, a separate instance of explorer.exe executes and loads the registered BHOs.

Notice, however, that this applies only when you open folders starting from the My Computer icon on the desktop. In this case, the shell calls explorer.exe each time you move to another folder. The same won't occur if you start browsing from a two-paned view. In fact, when you change the folder the shell doesn't launch a new instance of the browser but simply creates another instance of the embedded view object. Curiously, if you change the folder by typing a new name in the Address bar, the browsing always takes place in the same window whether Windows Explorer's view is single-paned or two-paned.

Things are far simpler with Internet Explorer. You have multiple copies of it only if you explicitly run iexplore.exe multiple times. When you open new windows from Internet Explorer, each window is duplicated in a new thread without originating a new process, and therefore without reloading BHOs.

Above all, the most interesting feature of BHOs is that they are extremely dynamic. Each time Windows Explorer's or Internet Explorer's window is opened, the loader reads the CLSIDs of the installed helper objects from the registry and deals with them. You can have different BHOs loaded by different copies of the browser if you edit the registry between instances of opening the browser. This means that you now have an excellent alternative to writing a new browser from scratch by embedding WebBrowser in a Microsoft Visual Basic® or Microsoft Foundation Classes (MFC) frame window. At the same time, you're given a great opportunity to arrange very extensible browsing applications. You can rely on the full power of Internet Explorer and add as many add-ons as you want when it suits your needs.

The IObjectWithSite Interface

From this high-level overview of Browser Helper Objects one concept emerges clearly: A BHO is a dynamic-link library (DLL) capable of attaching itself to any new instance of Internet Explorer and, under certain circumstances, also Windows Explorer. Such a module can get in touch with the browser through the container's site.

In general, a site is an intermediate object placed in the middle of the container and each contained object. Through it, the container manages the content of the contained object and, in return, makes the object's internal functionality available. The site-based relationship between containers and objects involves the implementation of interfaces like IOleClientSite on the container side, and IOleObject on the object side. By calling methods on IOleObject, the container makes the object aware of its host environment.

When the container is Internet Explorer (or the Web-enabled version of Windows Explorer), performance issues reduce this communication pattern to the essential. The object is now required to implement a simpler and lighter interface called IObjectWithSite. It provides just two methods.

Table 3. The IObjectWithSite Interface Definition

Method | Description
HRESULT SetSite(IUnknown* pUnkSite) | Receives the IUnknown pointer of the browser. The typical implementation will simply store such a pointer for further use.
HRESULT GetSite(REFIID riid, void** ppvSite) | Retrieves and returns the specified interface from the last site set through SetSite(). The typical implementation will query the previously stored pUnkSite pointer for the specified interface.

The only strict requirement for a BHO is implementing this interface. Notice that you should avoid returning E_NOTIMPL from any of the preceding functions. Either you don't implement the interface or you should be able to code its methods properly.
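
The bookkeeping behind a typical SetSite()/GetSite() pair can be modeled without any COM headers. The sketch below is a deliberately simplified stand-in, not the real interface: RefCounted replaces IUnknown and has no QueryInterface, the methods return bool instead of HRESULT, and the names SiteHolder and CountingSite are hypothetical. It also assumes that the host passes a null site when the browser window goes away.

```cpp
// Simplified stand-in for IUnknown: reference counting only.
struct RefCounted {
    virtual void AddRef() = 0;
    virtual void Release() = 0;
    virtual ~RefCounted() = default;
};

// Models how a helper object stores and hands out its site pointer.
class SiteHolder {
public:
    ~SiteHolder() { SetSite(nullptr); }

    // Stores the new site, which may be null when the host is shutting down.
    void SetSite(RefCounted* site) {
        if (site) site->AddRef();     // take the new reference first
        if (site_) site_->Release();  // then drop the old one
        site_ = site;
    }

    // Returns false if no site is set. Otherwise hands out a new reference
    // that the caller must release.
    bool GetSite(RefCounted** out) const {
        if (!out) return false;
        *out = site_;
        if (!site_) return false;
        site_->AddRef();
        return true;
    }

private:
    RefCounted* site_ = nullptr;
};

// Trivial site that only counts references, used to show the bookkeeping.
struct CountingSite : RefCounted {
    int refs = 0;
    void AddRef() override { ++refs; }
    void Release() override { --refs; }
};
```

Taking the new reference before releasing the old one keeps the object alive even if SetSite() is called twice with the same pointer.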

Writing a Browser Helper Object

A Browser Helper Object is a COM in-process server, so what's better than the Active Template Library (ATL) to build one? Another reason for choosing ATL is that it already provides a default and good enough implementation of the IObjectWithSite interface. Plus, among the predefined types of objects that the ATL COM Wizard natively supports, there's one, the Internet Explorer Object, that is just the type of object a BHO should be. An ATL Internet Explorer Object, in fact, is a simple object—that is, a COM server that supports IUnknown and self-registration—plus IObjectWithSite. If you add such an object to your ATL project, and call the corresponding class CViewSource, you get the following code from the wizard:

class ATL_NO_VTABLE CViewSource : 
   public CComObjectRootEx<CComSingleThreadModel>,
   public CComCoClass<CViewSource, &CLSID_ViewSource>,
   public IObjectWithSiteImpl<CViewSource>,
   public IDispatchImpl<IViewSource, &IID_IViewSource, 
                        &LIBID_HTMLEDITLib>

As you can see, the wizard already makes the class inherit from IObjectWithSiteImpl, which is an ATL template class that provides a basic implementation of IObjectWithSite. (See atlcom.h in the ATL\INCLUDE directory of Microsoft Visual Studio® 98.) Usually there's no need to override the GetSite() member function. Instead, the coded behavior of SetSite() often, if not always, needs customization. ATL, in fact, simply stores the IUnknown pointer to a member variable called m_spUnkSite.

Throughout the remainder of the article I'll discuss a fairly complex and rich example of a BHO. The object will attach itself to Internet Explorer only, and show a text box with the source code of the page being viewed. This code window will be automatically updated when you change the page and grayed out if the document that Internet Explorer is displaying is not an HTML page. Any change you apply to the raw HTML code is immediately reflected in the browser. This kind of magic is made possible by dynamic HTML (DHTML). Such a code window can be hidden and then recalled through a hot key. When visible, it shares the whole desktop work area with Internet Explorer, resizing properly as shown in Figure 3.

Figure 3. The Browser Helper Object in action. It attaches to Internet Explorer and shows the source code of the page being viewed. It also allows you to enter (but not save) changes.

The key point with this example is accessing Internet Explorer's browsing machinery, which is nothing more than an instance of the WebBrowser control. This example can be broken out into five main steps:

  1. Detecting who's loading the object, be it Internet Explorer or Windows Explorer.
  2. Getting the IWebBrowser2 interface that renders the WebBrowser object.
  3. Catching the WebBrowser's specific events.
  4. Accessing the document being viewed, making sure it is an HTML document.
  5. Managing the dialog box window with the HTML source code.

The first step is accomplished in the DllMain() code. SetSite(), instead, is the right place to get the pointer to the WebBrowser object. Let's look at all these steps in a bit more detail. For information on what isn't covered here you can refer to the source code available on the MSDN Online Web site.

Detecting Who's Calling

As mentioned earlier, a BHO can be called either by Internet Explorer or Windows Explorer if you're running at least shell version 4.71. In this case, I'm designing a helper object specifically targeted to work with HTML pages, so it will have nothing to do with Windows Explorer. A DLL that doesn't want to be loaded by a certain caller can simply return FALSE from its DllMain() function once it detects who's calling. The first argument of the GetModuleFileName() API function is the handle of the module whose name you want to know. If you pass NULL, the function returns the path of the executable file of the calling process.

if (dwReason == DLL_PROCESS_ATTACH)
{
TCHAR pszLoader[MAX_PATH];
GetModuleFileName(NULL, pszLoader, MAX_PATH);
_tcslwr(pszLoader);
if (_tcsstr(pszLoader, _T("explorer.exe"))) 
   return FALSE;
}

Once you know the name of the process, you can quit loading if it is Windows Explorer. Notice that a more selective choice might be dangerous. In fact, other processes could try to load the DLL for legitimate reasons and be rejected. The first victim of this situation is regsvr32.exe, the program used to automatically register the object. If you make a different test, say, only against the Internet Explorer executable:

if (!_tcsstr(pszLoader, _T("iexplore.exe"))) 

you won't be able to register the DLL any longer. In fact, when regsvr32.exe attempts to load the DLL to invoke the DllRegisterServer() function, the call will be rejected.
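The substring test shown earlier also matches any path that merely contains the text "explorer.exe" in one of its folders. A slightly stricter variation, sketched below in portable C++, compares only the file-name part of the loader path. The helper names (BaseName, ToLowerAscii, ShouldRejectLoader) are hypothetical and are not part of the sample; in DllMain() you would pass in the string returned by GetModuleFileName().

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns the file-name part of a path that uses '\' or '/' separators.
std::string BaseName(const std::string& path)
{
    const std::string::size_type pos = path.find_last_of("\\/");
    return (pos == std::string::npos) ? path : path.substr(pos + 1);
}

// Returns a lowercase copy (ASCII only, which is enough for the
// executable names compared here).
std::string ToLowerAscii(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: true only when the host process is Windows
// Explorer, so every other loader (iexplore.exe, regsvr32.exe, and
// so on) is still accepted.
bool ShouldRejectLoader(const std::string& loaderPath)
{
    return ToLowerAscii(BaseName(loaderPath)) == "explorer.exe";
}
```

Like the original test, this remains a deny list: only Windows Explorer is turned away, so regsvr32.exe can still load the DLL to register it.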

Get in Touch with WebBrowser

The SetSite() method is where the BHO is initialized and where you would perform all the tasks that happen only once. When you navigate to a URL with Internet Explorer, you should wait for a couple of events to make sure the required document has been completely downloaded and then initialized. Only at this point can you safely access its content through the exposed object model, if any. This means you need to acquire a couple of pointers. The first one is the pointer to IWebBrowser2, the interface that renders the WebBrowser object. The second pointer relates to events. This module must register as an event listener with the browser in order to receive the notification of downloads and document-specific events. Both pointers are kept in ATL smart pointer members, declared as follows:

CComQIPtr<IWebBrowser2, &IID_IWebBrowser2> m_spWebBrowser2;
CComQIPtr<IConnectionPointContainer, 
      &IID_IConnectionPointContainer> m_spCPC;

The source code looks like the following:

HRESULT CViewSource::SetSite(IUnknown *pUnkSite)
{
  // Retrieve and store the IWebBrowser2 pointer 
  m_spWebBrowser2 = pUnkSite; 
  if (m_spWebBrowser2 == NULL)
   return E_INVALIDARG;

  // Retrieve and store the IConnectionPointContainer pointer 
  m_spCPC = m_spWebBrowser2;
  if (m_spCPC == NULL) 
   return E_POINTER;

  // Retrieve and store the HWND of the browser. Plus install
  // a keyboard hook for further use
  RetrieveBrowserWindow();

  // Connect to the container for receiving event notifications
  return Connect();
}

To get a pointer to the IWebBrowser2 interface, you simply query it. The same occurs for IConnectionPointContainer, the first step for event handling. The code for SetSite() also retrieves the HWND of the browser and installs a keyboard hook on the current thread. The HWND will be used later to move and resize the Internet Explorer window. The hook, instead, serves the purpose of providing a hot key to make the code window appear and disappear at the user's leisure.

Getting Events from the Browser

When you navigate to a new URL, the browser needs to primarily accomplish two things: download the referred document and prepare the host environment for it. In other words, it must initialize and make externally available an object model for it. Depending on the type of document, this means either loading a Microsoft ActiveX® server application registered to handle that document (for example, Microsoft Word for .doc files) or initializing some internal components that analyze the document content and fill the elements of the object model that renders it. This is what happens with HTML pages whose content is made available through the DHTML object model. When the document has been completely downloaded, a DownloadComplete event is fired. This does not necessarily mean that it's safe to manage the document's content through its object model. Instead, a DocumentComplete event indicates that everything has been done and the document is ready. (Notice that DocumentComplete arrives only the first time you access the URL. Subsequently, if you press F5 or click the Refresh button, you'll receive only a DownloadComplete event.)

To intercept the events fired by the browser, the BHO needs to connect to it via an IConnectionPoint interface and pass the IDispatch table of the functions that will handle the various events. The pointer to IConnectionPointContainer obtained previously is used to call the FindConnectionPoint method that returns a pointer to the connection point object for the required outgoing interface: in this case, DIID_DWebBrowserEvents2. The following code shows how the connection takes place:

HRESULT CViewSource::Connect(void)
{
  HRESULT hr;
  CComPtr<IConnectionPoint> spCP;

  // Receives the connection point for WebBrowser events
  hr = m_spCPC->FindConnectionPoint(DIID_DWebBrowserEvents2, &spCP);
  if (FAILED(hr))
   return hr;

  // Pass our event handlers to the container. Each time an event occurs
  // the container will invoke the functions of the IDispatch interface 
  // we implemented.
  hr = spCP->Advise( reinterpret_cast<IDispatch*>(this), &m_dwCookie);
  return hr; 
}

By calling the IConnectionPoint's Advise() method, the BHO lets the browser know that it is interested in receiving notifications about events. Due to the COM event-handling mechanism, all this actually means that the BHO provides the browser with a pointer to its IDispatch interface. The browser will then call back the IDispatch's Invoke() method, passing the ID of the event as the first argument.

HRESULT CViewSource::Invoke(DISPID dispidMember, REFIID riid, 
   LCID lcid, WORD wFlags, DISPPARAMS* pDispParams, 
   VARIANT* pvarResult, EXCEPINFO* pExcepInfo, UINT* puArgErr)
{
  if (dispidMember == DISPID_DOCUMENTCOMPLETE) {
      OnDocumentComplete();
      m_bDocumentCompleted = true;
  }
  // ... (remaining event handling omitted)
}

It's important to remember to disconnect from the browser when events are no longer needed. If you forget to do this, the BHO will remain locked even after you close the browser's window. (Among other things, this means you can't recompile or delete the object.) A good time to disconnect is when you receive the OnQuit event.
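The pairing of connect and disconnect can be sketched without any Windows headers. In the real object the disconnect step would be a call to IConnectionPoint::Unadvise() with the m_dwCookie value that Advise() returned. The model below only imitates that contract: ConnectionPointModel, HelperModel, and the BrowserEvent enumeration are hypothetical stand-ins, not COM types or names from the sample.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical event IDs for this model; real code compares against
// the DISPID_* constants defined in exdispid.h.
enum class BrowserEvent { DocumentComplete, DownloadComplete, OnQuit };

// Windows-free stand-in for a connection point: Advise() hands back a
// cookie, and Unadvise() uses that cookie to remove the sink.
class ConnectionPointModel
{
public:
    using Sink = std::function<void(BrowserEvent)>;

    std::uint32_t Advise(Sink sink)
    {
        const std::uint32_t cookie = m_nextCookie++;
        m_sinks[cookie] = std::move(sink);
        return cookie;
    }

    bool Unadvise(std::uint32_t cookie) { return m_sinks.erase(cookie) == 1; }

    void Fire(BrowserEvent e)
    {
        // Iterate over a copy so a sink may unadvise itself while it runs.
        const std::map<std::uint32_t, Sink> snapshot = m_sinks;
        for (const auto& entry : snapshot)
            entry.second(e);
    }

    std::size_t SinkCount() const { return m_sinks.size(); }

private:
    std::map<std::uint32_t, Sink> m_sinks;
    std::uint32_t m_nextCookie = 1;
};

// Model of the helper object: connects once, disconnects on OnQuit.
class HelperModel
{
public:
    explicit HelperModel(ConnectionPointModel& cp) : m_cp(cp)
    {
        m_cookie = m_cp.Advise([this](BrowserEvent e) { OnEvent(e); });
    }

    int DocumentsSeen() const { return m_documentsSeen; }
    bool Connected() const { return m_cookie != 0; }

private:
    void OnEvent(BrowserEvent e)
    {
        if (e == BrowserEvent::DocumentComplete)
        {
            ++m_documentsSeen;
        }
        else if (e == BrowserEvent::OnQuit && m_cookie != 0)
        {
            m_cp.Unadvise(m_cookie);  // release the connection
            m_cookie = 0;             // the cookie is no longer valid
        }
    }

    ConnectionPointModel& m_cp;
    std::uint32_t m_cookie = 0;
    int m_documentsSeen = 0;
};
```

Once the quit event has been handled, the model holds no sink, so later events never reach the helper. In the real browser, that released connection is what allows the DLL to be unloaded.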

Accessing the Document Object

At this point the BHO has a reference to Internet Explorer's WebBrowser control and is connected to the browser for receiving all the events it generates. When the Web page is completely downloaded and properly initialized, it's finally possible to access it through the DHTML document object model. The Document property of WebBrowser returns a pointer to the IDispatch interface of the document object:

CComPtr<IDispatch> pDisp;
HRESULT hr = m_spWebBrowser2->get_Document(&pDisp);

What the get_Document() method provides is just a pointer to an interface. We need to make sure that behind that IDispatch pointer there's really an HTML document object. If I were using Visual Basic, the following would have been equivalent code:

Dim doc As Object
Set doc = WebBrowser1.Document
If TypeName(doc)="HTMLDocument" Then
   ' Get the document content and display
Else
   ' Disable the display dialog
End If

What's needed is a way to learn about the nature of the IDispatch pointer returned by get_Document(). Internet Explorer is more than an HTML browser and is capable of hosting any ActiveX document—that is, any document for which an application exists that acts as an ActiveX document server. Given this, there's no guarantee the document viewed is really an HTML page.

One solution is to look at the location URL and check the URL's extension. But what about Active Server Pages (ASP) or a URL where the HTML page is implicit? And what if you're using custom protocols like about or res? (For more information about custom protocols, check out my Cutting Edge column in the January 1999 issue of MIND magazine.)

I decided to take another approach, much more akin to the Visual Basic code just shown. The idea is, if the IDispatch pointer actually refers to an HTML document, querying for the IHTMLDocument2 interface would be successful. IHTMLDocument2 is the interface that wraps up all the functionality that the DHTML object model exposes for an HTML page. The following code snippet shows how to proceed:

CComPtr<IDispatch> pDisp;
HRESULT hr = m_spWebBrowser2->get_Document(&pDisp);
CComQIPtr<IHTMLDocument2, &IID_IHTMLDocument2> spHTML;
spHTML = pDisp;
if (spHTML) {
   // get the content of the document and display it
} 
else {
   // disable the Code Window controls
}

The spHTML pointer is NULL if the query interface for IHTMLDocument2 failed. Otherwise, we're fine with the methods and properties of the DHTML object model.

Now the problem becomes how to get the source code of the displayed page. Fortunately, a rudimentary knowledge of DHTML is enough to solve it. Just as an HTML page encloses all its content in a <BODY> tag, the DHTML object model requires you to get a pointer to the Body object as the first step:

CComPtr<IHTMLElement> m_pBody;
hr = spHTML->get_body(&m_pBody);

Curiously, the DHTML object model doesn't let you know about the raw content of the tags that precede <BODY>, such as <HEAD>. Their content is processed and then stored in a number of properties, but you still don't have one returning the raw text contained in the original HTML file. What the body can tell, however, will suffice here. To get the HTML code included in the <BODY>…</BODY> tags I need to read the content of the outerHTML property into a BSTR variable:

BSTR bstrHTMLText;
hr = m_pBody->get_outerHTML(&bstrHTMLText);

At this point, displaying the text in the code window is a matter of creating the window, converting the string from Unicode to ANSI, and setting the text of the edit box, as shown in Figure 3. The following shows the full code for this:

HRESULT CViewSource::GetDocumentContent()
{
  USES_CONVERSION;
  
  // Get the WebBrowser's document object
  CComPtr<IDispatch> pDisp;
  HRESULT hr = m_spWebBrowser2->get_Document(&pDisp);
  if (FAILED(hr))
   return hr;

  // Verify that what we get is a pointer to a IHTMLDocument2 
  // interface. To be sure, let's query for 
  // the IHTMLDocument2 interface (through smart pointers)
  CComQIPtr<IHTMLDocument2, &IID_IHTMLDocument2> spHTML;
  spHTML = pDisp;

  // Extract the source code of the document
  if (spHTML)
  {
    // Get the BODY object
    hr = spHTML->get_body(&m_pBody); 
    if (FAILED(hr))
        return hr;

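    // Get the HTML text
    BSTR bstrHTMLText = NULL;
    hr = m_pBody->get_outerHTML(&bstrHTMLText); 
    if (FAILED(hr))
     return hr;

    // Convert the text from Unicode to ANSI. The buffer needs one
    // extra character for the terminating null.
    LPTSTR psz = new TCHAR[SysStringLen(bstrHTMLText) + 1];
    lstrcpy(psz, OLE2T(bstrHTMLText));

    // The BSTR is owned by the caller and must be released
    SysFreeString(bstrHTMLText);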
      
    // Enable changes to the text
    HWND hwnd = m_dlgCode.GetDlgItem(IDC_TEXT);
    EnableWindow(hwnd, true);
    hwnd = m_dlgCode.GetDlgItem(IDC_APPLY);
    EnableWindow(hwnd, true);

    // Set the text in the Code Window
    m_dlgCode.SetDlgItemText(IDC_TEXT, psz); 
    delete [] psz;
  }
  else   // The document isn't a HTML page
  {
    m_dlgCode.SetDlgItemText(IDC_TEXT, _T("")); 
    HWND hwnd = m_dlgCode.GetDlgItem(IDC_TEXT);
    EnableWindow(hwnd, false);
    hwnd = m_dlgCode.GetDlgItem(IDC_APPLY);
    EnableWindow(hwnd, false);
  }

  return S_OK;  
}

Because I run this code in response to the DocumentComplete notification, each new page is automatically and promptly processed. The DHTML object model lets you modify on the fly the structure of the page, but all the changes will be lost as soon as you refresh the view by hitting F5 or clicking the browser's Refresh button. By also handling the DownloadComplete event you can refresh the code window as well. (Pay attention to the fact that the DownloadComplete event comes before DocumentComplete.) So, you should ignore the DownloadComplete generated by the first download of the page and consider it only when it originates from a refresh. A simple Boolean member, for example m_bDocumentCompleted, is of great help in distinguishing between the situations.
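The flag logic described above can be sketched as a small state tracker, written here in portable C++. Only the member name m_bDocumentCompleted comes from the article. The class name, the method names, and especially the reset on navigation are assumptions made for this sketch: it supposes that a refresh produces only DownloadComplete, as described earlier, and that some handler (for example one for BeforeNavigate2) clears the flag when the user moves to a new page.

```cpp
// Hypothetical tracker that decides when the code window needs reloading.
class RefreshTracker
{
public:
    // Assumption: called when a new navigation starts, so that the
    // DownloadComplete of the next page is not mistaken for a refresh.
    void OnBeforeNavigate() { m_bDocumentCompleted = false; }

    // Returns true when the code window should be reloaded.
    bool OnDownloadComplete() const
    {
        // Before the first DocumentComplete, this is the initial
        // download, which the DocumentComplete handler will take care
        // of. Afterwards it can only come from a refresh.
        return m_bDocumentCompleted;
    }

    // DocumentComplete always reloads the code window.
    bool OnDocumentComplete()
    {
        m_bDocumentCompleted = true;
        return true;
    }

private:
    bool m_bDocumentCompleted = false;
};
```

In the helper object, each method would be called from the matching branch of Invoke(), and a true return value would trigger GetDocumentContent().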

Managing the Code Window

The code window used to show the HTML source code of the current page is another ATL basic element—a dialog box window that you find in the Miscellaneous page of the ATL Object Wizard. I resize this window in response to the WM_INITDIALOG message and make it occupy the lowest portion of the desktop work area—that is, the available screen minus the taskbar, wherever it is docked.
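The layout arithmetic can be sketched independently of the windowing calls. On Windows the work area rectangle is available from SystemParametersInfo() with SPI_GETWORKAREA; the portable sketch below takes it as a parameter instead. Rect, Layout, SplitWorkArea, and the codeHeight parameter are hypothetical names that do not appear in the sample.

```cpp
// Minimal rectangle type with the same field order as the Win32 RECT.
struct Rect { int left, top, right, bottom; };

// Result of the split: where the browser and the code window go.
struct Layout { Rect browser; Rect codeWindow; };

// Splits the work area horizontally: the browser takes the top part
// and the code window takes a strip codeHeight pixels tall at the
// bottom. codeHeight is clamped to the height of the work area.
Layout SplitWorkArea(const Rect& workArea, int codeHeight)
{
    const int available = workArea.bottom - workArea.top;
    if (codeHeight > available) codeHeight = available;
    if (codeHeight < 0) codeHeight = 0;

    const int split = workArea.bottom - codeHeight;

    Layout layout;
    layout.browser    = { workArea.left, workArea.top, workArea.right, split };
    layout.codeWindow = { workArea.left, split, workArea.right, workArea.bottom };
    return layout;
}
```

Because the work area already excludes the taskbar, the same computation holds wherever the taskbar is docked. The two resulting rectangles would then be applied to the browser HWND retrieved in SetSite() and to the dialog box.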

This window may or may not appear at the browser startup. By default it does, but this can be prevented by clearing the Show window at startup check box. You can also close the window if you like. By pressing F12, however, you can bring it back at any time. F12 is caught by the keyboard hook I installed in SetSite().

The startup setting is saved to the registry in full accordance with Microsoft guidelines. To read and write the registry I employed the new Shell Lightweight API (shlwapi.dll) instead of the Win32 functions, saving the hassle of opening and closing the involved keys:

DWORD dwType = 0;
DWORD dwVal = 1;   // default if the value is missing: show the window
DWORD dwSize = sizeof(DWORD);
SHGetValue(HKEY_CURRENT_USER, _T("Software\\MSDN\\BHO"), 
   _T("ShowWindowAtStartup"), &dwType, &dwVal, &dwSize);

This DLL was introduced with Internet Explorer 4.0 and Active Desktop, and is a standard system component beginning with Windows 98. Its functions are more direct than the corresponding Win32 functions and are preferable for single reads and writes.

Registration of Helper Objects

A BHO is a COM server and should be registered both as a COM server and as a BHO. The ATL Wizard provides you with the necessary registrar script code (RGS) that accomplishes the first task. What follows is the RGS code for the second task, which installs the object as a helper object. (The CLSID comes from the example.)

HKLM {
 SOFTWARE {
  Microsoft {   
   Windows {
    CurrentVersion {
     Explorer {
      'Browser Helper Objects' {
       ForceRemove {1E1B2879-88FF-11D2-8D96-D7ACAC95951F}        
}}}}}}}

Note the ForceRemove clause that causes the key to be removed when you unregister the object.

Under the Browser Helper Objects key fall all the installed helper objects. Such a list is never cached by the browser, so installing and testing BHOs is really a quick matter.

Summary

In this article, I presented Browser Helper Objects, a relatively new and powerful way of injecting your code directly inside the browser's address space. What you have to do is write a COM server that supports the IObjectWithSite interface. At this point, your module is for all intents and purposes a component of the browser machinery. The sample I've built throughout the article also touched on topics such as COM events, the dynamic HTML object model, and the WebBrowser programming interface, which may appear to be a little off the topic. I think, instead, that this demonstrates the power of BHOs, and at the same time provides a real-world platform on which to build your own objects. If you need to know what the browser is displaying, you absolutely need to sink events and become familiar with WebBrowser. Now you know: forewarned is forearmed. To conclude, let me also remind you that BHOs are useful with Windows Explorer as well and, thanks to WebBrowser, they can be driven from your code.