Visual Basic and the Win32 Internet SDK

James Braum
Microsoft Developer Network Technology Group

September 5, 1996

The VBNetGet sample application accompanies this technical article.

Abstract

This article examines a variety of Microsoft® Win32® Internet (WinInet) functions and illustrates how to use them in a Visual Basic® application. Declaring and using these WinInet functions in a class module allows you to leverage the power of the Win32 Internet API, facilitating rapid development of slick Internet-aware applications in any language. This article does not provide in-depth coverage of the API functions; for that, refer to the Microsoft Win32 Internet Programmer's Reference at http://www.microsoft.com/intdev/sdk/docs/wininet/.

In this article, I develop a sample application (NetGet) that lets you download and view the contents of a Uniform Resource Locator (URL). The application uses a reusable class that is developed along the way. This class encapsulates the calls to the WinInet functions, making it convenient to use in your own projects. (If you are interested in looking at a Visual C++® implementation of the WinInet functions, refer to the article "The Internet API (or How to Get from Here to There)" by Robert Coleridge, elsewhere on this CD.) The raw HTML is put into a text box, and the list of references and images are parsed and loaded into two different list box controls. A third list box control contains the files you can download.

Introduction

Several third-party tools and controls are available for developing Internet-aware applications, including the Microsoft Internet Control Pack (information available from http://www.microsoft.com/icp/icpmain.htm). It is often more convenient to write straight to an application programming interface (API) than to plug in a control, because you can pick and choose exactly the set of functions you need to get a particular job done. Programming with the Win32 Internet (WinInet) API gives you just this level of control. Finding the functions you need and placing them in a class module makes sense because you end up with a reusable and robust object. After you have tested the object, you can safely use it in other projects. That's what this article demonstrates.

WinInet provides a wealth of functionality. Functions for working with Gopher, File Transfer Protocol (FTP), and Hypertext Transfer Protocol (HTTP) abstract the complexities of working with these protocols into an easy-to-use API that can be called from almost any language. Knowledge of Transmission Control Protocol/Internet Protocol (TCP/IP), Windows® Sockets, the inner workings of HTTP, and so forth is not required to use this API. Additionally, because this API offers an abstraction from the actual implementation, you can be sure that as protocols evolve you will not have to rewrite your applications; only the WinInet DLL will need upgrading.

Consider this article a starting point. Take the ideas presented and apply them to your solution set. If the sample class contains the functionality you need, simply drop it into your project and go to work.

The WinInet Functions

This article examines four WinInet functions: InternetOpen, InternetOpenUrl, InternetReadFile, and InternetCloseHandle.

Let's dive right in and look at some sample code. The following code reads the contents of a URL into a buffer. It does not take reams of code to get the contents of a URL: you open a session using InternetOpen, call InternetOpenUrl with the URL you would like to read (http://www.microsoft.com/ in this case, but I am sure that you can come up with one on your own), progressively read the file using InternetReadFile, and close both handles with InternetCloseHandle.

Dim hInternetSession       As Long
Dim hUrlFile               As Long
Dim sReadBuffer            As String * 4096     ' Grab 4K at a time
Dim sBuffer                As String
Dim lNumberOfBytesRead     As Long
Dim bDoLoop                As Boolean

' Open a session, then open the URL within that session.
hInternetSession = InternetOpen("My VB App!", _
         INTERNET_OPEN_TYPE_PRECONFIG, vbNullString, vbNullString, 0)
hUrlFile = InternetOpenUrl(hInternetSession, _
         "http://www.microsoft.com/", vbNullString, 0, INTERNET_FLAG_RELOAD, 0)

' Read until the call fails or reports zero bytes read (end of data).
bDoLoop = True
While bDoLoop
    sReadBuffer = ""    ' Reset the fixed-length buffer (it is padded with spaces)
    bDoLoop = InternetReadFile(hUrlFile, sReadBuffer, _
                Len(sReadBuffer), lNumberOfBytesRead)
    sBuffer = sBuffer & Left$(sReadBuffer, lNumberOfBytesRead)
    If Not CBool(lNumberOfBytesRead) Then bDoLoop = False
Wend

' Close the handles in the reverse order in which they were opened.
InternetCloseHandle hUrlFile
InternetCloseHandle hInternetSession

InternetOpen

This function returns a handle that subsequent WinInet function calls use. The declaration looks like this:

Private Declare Function InternetOpen Lib "wininet.dll" _ 
      Alias "InternetOpenA" (ByVal sAgent As String, _ 
      ByVal lAccessType As Long, ByVal sProxyName As String, _
      ByVal sProxyBypass As String, ByVal lFlags As Long) As Long

The InternetOpen function takes five parameters and returns a handle if successful. If the call fails, it returns 0; check the LastDllError property of the Err object for the specific error code. The Error Codes section in the Microsoft Win32 Internet Programmer's Reference (http://www.microsoft.com/intdev/sdk/docs/wininet/) describes what each code means.
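The failure check described above might look like the following minimal sketch. It assumes the InternetOpen declaration and the INTERNET_OPEN_TYPE_PRECONFIG constant used in this article are in scope in the same module; the OpenSession helper name is hypothetical and is not part of the sample application.

```vb
' Hypothetical helper: opens a WinInet session and reports a failure.
' Assumes the InternetOpen Declare and INTERNET_OPEN_TYPE_PRECONFIG (0)
' are declared in the same module.
Private Function OpenSession(ByVal sAgent As String) As Long
    Dim hSession    As Long
    Dim lErrorCode  As Long

    hSession = InternetOpen(sAgent, INTERNET_OPEN_TYPE_PRECONFIG, _
                            vbNullString, vbNullString, 0)
    If hSession = 0 Then
        ' Read the error code right away, before another API call
        ' has a chance to overwrite it.
        lErrorCode = Err.LastDllError
        Debug.Print "InternetOpen failed, error code " & CStr(lErrorCode)
    End If
    OpenSession = hSession
End Function
```

A caller would treat a return value of 0 as "no session" and skip any further WinInet calls.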

The sAgent parameter specifies the name of the application or entity calling the Internet functions. This is the user agent in the HTTP protocol. This could be a value such as "My Internet Application" or "Microsoft Internet Explorer." Some Web servers look at this to autodetect the type of client you are using. Think of this as the way you identify yourself to the HTTP server.

The lAccessType parameter indicates the type of access needed.

sProxyName is a string specifying the proxy server (or servers) to use if requested. To interrogate the registry for proxy server details, leave this parameter NULL (pass vbNullString in Visual Basic) and set lAccessType to INTERNET_OPEN_TYPE_PROXY.

sProxyBypass is a string containing a list of hosts or Internet Protocol (IP) addresses known locally so they are not passed through to the proxy server. To interrogate the registry for local IP addresses, leave this parameter NULL and use INTERNET_OPEN_TYPE_PROXY for lAccessType.

lFlags specifies whether you want asynchronous or synchronous behavior from subsequent WinInet function calls derived from this handle. To get synchronous behavior, you must specify 0 for this flag.
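As a rough illustration of how the access type and proxy parameters work together, the sketch below opens a synchronous session either directly or through a named proxy. The constant values are the ones defined in the WinInet header file; the OpenSessionVia helper name and the proxy host in the usage note are hypothetical.

```vb
' Access-type values as defined in the WinInet header file.
Private Const INTERNET_OPEN_TYPE_DIRECT = 1     ' No proxy
Private Const INTERNET_OPEN_TYPE_PROXY = 3      ' Use the named proxy

' Hypothetical helper: opens a synchronous session (lFlags = 0),
' through a named proxy when one is supplied and directly otherwise.
' Assumes the InternetOpen Declare is in the same module.
Private Function OpenSessionVia(ByVal sProxy As String) As Long
    If Len(sProxy) > 0 Then
        OpenSessionVia = InternetOpen("My VB App!", _
                INTERNET_OPEN_TYPE_PROXY, sProxy, vbNullString, 0)
    Else
        OpenSessionVia = InternetOpen("My VB App!", _
                INTERNET_OPEN_TYPE_DIRECT, vbNullString, vbNullString, 0)
    End If
End Function
```

For example, OpenSessionVia("proxy.example.com:80") would route requests through that (made-up) proxy, while OpenSessionVia("") would connect directly.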

InternetOpenUrl

This function returns a handle to the URL if the connection succeeds. This is the function that actually begins to read from the URL.

It is important to note that you can use these functions today with the Gopher, FTP, and HTTP protocols; they are essentially all-purpose Internet functions. For opening and reading Hypertext Markup Language (HTML) from an HTTP URL, these functions are more than adequate. However, if you need additional functionality, such as querying for the number of bytes the URL will return, look at the HTTP WinInet functions in the Microsoft Win32 Internet Programmer's Reference.

The declaration looks like this:

Private Declare Function InternetOpenUrl Lib "wininet.dll" _ 
   Alias "InternetOpenUrlA" (ByVal hInternetSession As Long, _ 
   ByVal sUrl As String, ByVal sHeaders As String, _ 
   ByVal lHeadersLength As Long, ByVal lFlags As Long, _ 
   ByVal lContext As Long) As Long

InternetOpenUrl takes six parameters and, if successful, returns a handle. After you have finished with the handle returned from this function, close it with the InternetCloseHandle function.

hInternetSession is the handle obtained from calling InternetOpen.

sUrl is a string with the URL that you wish to read. The function will parse the URL for you, so just specify the complete URL.

sHeaders is a string that contains optional headers you may want to send to an HTTP server. lHeadersLength is the length of the optional headers string (if you use one). The request-header fields in HTTP allow the client to pass additional information about the request, and about the client itself, to the server. For example, you could specify a Transfer-Encoding scheme and request that the data be sent in that particular format.

lFlags is an action flag that tells the function how to behave. This article uses INTERNET_FLAG_RELOAD, which forces the data to be retrieved from the server rather than from the local cache; see the Microsoft Win32 Internet Programmer's Reference for the full list of values.

lContext is passed to callback functions along with the returned handle. Just use 0 for this parameter because you are using these functions synchronously.
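To make the sHeaders and lHeadersLength parameters concrete, here is a minimal sketch that sends one extra request header. It assumes hInternetSession already holds a valid handle from InternetOpen and that INTERNET_FLAG_RELOAD is declared as in this article's class module; the choice of the Accept header is only an illustration.

```vb
' Assumes hInternetSession is a valid handle from InternetOpen and
' that INTERNET_FLAG_RELOAD is declared in the same module.
Dim sHeaders    As String
Dim hUrl        As Long

' Each request header ends with a carriage return/linefeed pair.
sHeaders = "Accept: text/html" & vbCrLf

hUrl = InternetOpenUrl(hInternetSession, "http://www.microsoft.com/", _
                       sHeaders, Len(sHeaders), INTERNET_FLAG_RELOAD, 0)
If hUrl = 0 Then
    Debug.Print "InternetOpenUrl failed, error code " & _
                CStr(Err.LastDllError)
Else
    ' ... read the data with InternetReadFile here ...
    InternetCloseHandle hUrl
End If
```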

InternetReadFile

Once you have a handle from InternetOpenUrl you can start reading data. You pass this function a buffer to read the data into and a value indicating the length of the buffer. The function returns TRUE if successful, and FALSE otherwise. An out parameter, lNumberOfBytesRead, indicates how much data was read from the call. If the function returns TRUE and the out parameter indicating the number of bytes read is 0, the function has read all the data corresponding to the URL.

Here is the declaration:

Private Declare Function InternetReadFile Lib "wininet.dll" _
   (ByVal hFile As Long, ByVal sBuffer As String, _
   ByVal lNumberOfBytesToRead As Long, _
   lNumberOfBytesRead As Long) As Long   ' Win32 BOOL is 32 bits wide

hFile is the handle returned from the call to InternetOpenUrl.

sBuffer is a string that serves as the buffer. For reading HTML data, this buffer must be large enough to contain the complete HTML headers. Also, if you are reading directories in HTML format, the buffer must be large enough to hold the entire contents of the directory. A 32K buffer will be large enough in most cases.

lNumberOfBytesToRead specifies the number of bytes you want to read in one call to the function. Make this equal to the length of the buffer, unless you only want to fill a portion of the buffer for each read.

Finally, lNumberOfBytesRead is the number of bytes the function actually read. It is an out parameter, so this variable will be set by the function.

InternetCloseHandle

This function closes handles opened with WinInet functions and frees resources associated with these functions. If there are outstanding operations on handles that are about to be closed, they will be canceled and the data will be lost.

The function returns TRUE if the handle was closed successfully, and FALSE otherwise. This is the declaration:

Private Declare Function InternetCloseHandle Lib "wininet.dll" _
      (ByVal hInet As Long) As Long   ' Win32 BOOL is 32 bits wide
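One way to make sure both handles are released is a small cleanup routine along the following lines. This is a sketch only: the CloseHandles name is hypothetical, and it assumes module-level hUrlFile and hInternetSession variables like the ones used elsewhere in this article.

```vb
' Hypothetical cleanup routine; assumes module-level handles named
' hUrlFile and hInternetSession and the InternetCloseHandle Declare.
Private Sub CloseHandles()
    ' Close the URL handle first, then the session it was derived from.
    If hUrlFile <> 0 Then
        InternetCloseHandle hUrlFile
        hUrlFile = 0
    End If
    If hInternetSession <> 0 Then
        InternetCloseHandle hInternetSession
        hInternetSession = 0
    End If
End Sub
```

Resetting each variable to 0 after closing it makes the routine safe to call more than once.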

The NetGet Sample Application

Now let's look at the sample Visual Basic® application, which allows us to view the raw HTML from a specified URL and download files. The sample application uses a class module that will encapsulate the WinInet API calls discussed above. You can drop this class into your project and go to work developing Internet-aware applications, or you can copy the declare statements and use the functions in your own classes or modules.

The NetGet application lets you specify a URL, then view the HTML, references, and images associated with this URL. Double-clicking an image will place a copy of the image in the files list box on the right. The files will be saved to the application directory (App.Path) when you click the Save Files button. The application does some simplistic parsing of the HTML to get the references and images. To find the references, it looks for "<A HREF=", and to find images it looks for "<IMG SRC="; it then continues to read until it reaches the end of the tag.

Figure 1. NetGet References/Images/Files View

NetGet consists of one form and one class module. When the application starts, it creates an instance of the CNetGet class. You can use this one instance of the object to read multiple URLs.

How to Use CNetGet

The CNetGet object exposes the following methods and properties:

Methods      Properties
ReadURL      GetRawHTML
ParseHTML    GetLastError
Init         SetStatusWindow
Term         SetUserAgent

As you can see, the object exposes a few methods and properties that make it quick to develop Internet-aware applications. Using the object takes only a few lines of code:

Dim objNetGet    As New CNetGet
Dim cRefs        As New Collection
Dim vRef         As Variant

objNetGet.Init
If objNetGet.ReadURL("http://www.microsoft.com/") Then
    If objNetGet.ParseHTML("A HREF=""", cRefs, " "" > ? # ") Then
        For Each vRef In cRefs
            Debug.Print vRef
        Next vRef
    End If
End If
objNetGet.Term
Set objNetGet = Nothing
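The same pattern should work for images. The sketch below is an assumption-laden variation on the code above: the IMG SRC=" token is chosen by analogy with the A HREF=" token, and GetRawHTML and GetLastError are assumed to return strings.

```vb
Dim objNetGet    As New CNetGet
Dim cImages      As New Collection
Dim vImage       As Variant

objNetGet.Init
If objNetGet.ReadURL("http://www.microsoft.com/") Then
    ' Token assumed by analogy with the A HREF=" token used for references.
    If objNetGet.ParseHTML("IMG SRC=""", cImages, " "" > ? # ") Then
        For Each vImage In cImages
            Debug.Print vImage
        Next vImage
    End If
    ' GetRawHTML is assumed to return the page text as a String.
    Debug.Print CStr(Len(objNetGet.GetRawHTML)) & " characters read"
Else
    ' GetLastError is assumed to return a descriptive String.
    Debug.Print objNetGet.GetLastError
End If
objNetGet.Term
Set objNetGet = Nothing
```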

The GetRawHTML property exposes the interface to the HTML. Clicking the HTML tab gives you a view of the HTML:

Figure 2. NetGet raw HTML view

Implementation of the CNetGet Class Module

Now that you have seen the interface to the object, let's look at the implementation.

The general declarations section of the class module contains the WinInet declarations as well as the following private constants and variables:

Private Const INTERNET_OPEN_TYPE_PRECONFIG = 0
Private Const INTERNET_FLAG_RELOAD = &H80000000
Private hInternetSession    As Long
Private hUrlFile            As Long
Private sContents           As String
Private sLastError          As String
Private sStatus             As String
Private objWindow           As Object

The flags and the user agent constant have already been discussed. hInternetSession and hUrlFile hold the handles returned from the calls to InternetOpen and InternetOpenUrl. objWindow is the window that you want the object to update. sContents contains the HTML read from the URL.

The ReadURL method calls the functions necessary to open, read data from, and close an Internet connection.

Public Function ReadUrl(ByVal sUrl As String, Optional vFileName As _
                        Variant) As Boolean
Dim sReadBuffer         As String * 2048 ' Bytes to read from one call
Dim lNumberOfBytesRead  As Long       ' Bytes read from call to InternetReadFile
Dim lTotalBytesRead     As Long       ' Total bytes read
Dim bDoLoop             As Boolean    ' Return value from InternetReadFile
Dim bReadInternetFile   As Boolean
Dim bWriteToFile        As Boolean
On Error GoTo errReadUrl
Screen.MousePointer = vbHourglass
SetStatus "Opening Url..."
If Not IsMissing(vFileName) Then
    Dim iFileNum As Integer
    iFileNum = FreeFile
    Open CStr(vFileName) For Binary As iFileNum
    bWriteToFile = True
End If
hUrlFile = InternetOpenUrl(hInternetSession, sUrl, vbNullString, 0, _
                           INTERNET_FLAG_RELOAD, 0)
If CBool(hUrlFile) Then
    sContents = scBlankStr
    bDoLoop = True
    While bDoLoop
        sReadBuffer = scBlankStr
        bDoLoop = InternetReadFile(hUrlFile, sReadBuffer, Len(sReadBuffer), _
                           lNumberOfBytesRead)
        If Not CBool(bDoLoop) Then CheckError
        lTotalBytesRead = lTotalBytesRead + lNumberOfBytesRead
        SetStatus "Reading Url: " & CStr(lTotalBytesRead) & " Bytes read..."
        If CBool(lNumberOfBytesRead) Then
            If bWriteToFile Then
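                Dim sChunk As String
                ' Write only the bytes actually read, not the whole padded buffer.
                sChunk = Left$(sReadBuffer, lNumberOfBytesRead)
                Put #iFileNum, , sChunk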
            Else
                sContents = sContents & Left$(sReadBuffer,lNumberOfBytesRead)
            End If
        Else
            bDoLoop = False
            bReadInternetFile = True
        End If
    Wend
    InternetCloseHandle (hUrlFile)
    ReadUrl = True
Else
    CheckError
End If
If bWriteToFile Then Close #iFileNum   ' Close only the file opened by this function
SetStatus "Ready"
Screen.MousePointer = vbDefault
Exit Function
errReadUrl:
sLastError = Error$(Err)
Screen.MousePointer = vbDefault
Exit Function
End Function

This function illustrates the order in which you call the WinInet functions.

If the ReadURL function returns FALSE, check the GetLastError property of the object for a meaningful error message.
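The optional file name parameter and the error check can be combined as in the following sketch. The URL and the target file name are hypothetical and serve only as an illustration; GetLastError is assumed to return a string.

```vb
Dim objNetGet    As New CNetGet
Dim sTarget      As String

' Hypothetical file name, for illustration only.
sTarget = App.Path & "\logo.gif"

objNetGet.Init
' Passing a file name makes ReadURL write the data to disk
' instead of keeping it in memory.
If objNetGet.ReadURL("http://www.example.com/images/logo.gif", sTarget) Then
    Debug.Print "Saved to " & sTarget
Else
    Debug.Print "Download failed: " & objNetGet.GetLastError
End If
objNetGet.Term
Set objNetGet = Nothing
```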

ParseHTML takes a token to search for, such as <A HREF=", a collection, and an optional string that contains a list of delimiters. When the token is found (<A HREF=" in this case), the method scans until it reaches one of the delimiter characters in the delimiter string. Here is a look at a portion of the method:

    lStartPos = 1
    lPosInStr = InStr(lStartPos, sContents, sToken, 1)
    While CBool(lPosInStr)
        iCounter = 0
        lPosInStr = lPosInStr + Len(sToken)
        Do
            iRetVal = InStr(1, sUseDelimiter, Mid$(sContents, lPosInStr + _
                  iCounter, 1), 1)
            iCounter = iCounter + 1
        Loop While Not CBool(iRetVal)
        sAddItem = StripChars(Mid$(sContents, lPosInStr, iCounter - 1))
        If Len(sAddItem) Then colItems.Add sAddItem, sAddItem
        lStartPos = lPosInStr + iCounter
        lPosInStr = InStr(lStartPos, sContents, sToken, 1)
    Wend 

The StripChars function removes linefeeds and other unnecessary characters from the strings recognized by the routine.
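The article does not list StripChars itself, so the following is only a guess at what such a helper could look like. The set of characters it removes (carriage return, linefeed, tab, and double quote) is an assumption.

```vb
' Hypothetical sketch of a StripChars-style helper. The characters
' removed here are an assumption, not the sample's actual list.
Private Function StripChars(ByVal sInput As String) As String
    Dim sOutput As String
    Dim sChar   As String
    Dim lPos    As Long

    For lPos = 1 To Len(sInput)
        sChar = Mid$(sInput, lPos, 1)
        Select Case sChar
            Case vbCr, vbLf, vbTab, """"
                ' Skip characters that should not appear in a URL.
            Case Else
                sOutput = sOutput & sChar
        End Select
    Next lPos
    StripChars = sOutput
End Function
```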

This should give you an idea of what you can do once you have read an HTML page. I can think of a number of applications where you would want to read an HTML page and parse the contents. For example, with the references parsed from each page you could construct a Web-traversing robot that could go out and recursively search for pages that contain some particular text.
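A very rough sketch of such a robot, built on the CNetGet methods shown earlier, might look like this. The Crawl procedure is hypothetical; it follows only absolute http:// references, assumes GetRawHTML returns a string, and relies on a depth limit rather than a list of visited pages to stop.

```vb
' Hypothetical recursive search; not part of the sample application.
' Assumes objNetGet has already been initialized with Init.
Private Sub Crawl(objNetGet As CNetGet, ByVal sUrl As String, _
                  ByVal sFind As String, ByVal iDepth As Integer)
    Dim cRefs   As New Collection
    Dim vRef    As Variant

    If iDepth < 0 Then Exit Sub
    If Not objNetGet.ReadURL(sUrl) Then Exit Sub

    ' Case-insensitive search of the page just read.
    If InStr(1, objNetGet.GetRawHTML, sFind, 1) > 0 Then
        Debug.Print "Found in " & sUrl
    End If

    ' Collect the references before recursing, because the next
    ' ReadURL call replaces the page held by the object.
    If objNetGet.ParseHTML("A HREF=""", cRefs, " "" > ? # ") Then
        For Each vRef In cRefs
            If LCase$(Left$(CStr(vRef), 7)) = "http://" Then
                Crawl objNetGet, CStr(vRef), sFind, iDepth - 1
            End If
        Next vRef
    End If
End Sub
```

A call such as Crawl objNetGet, "http://www.microsoft.com/", "WinInet", 1 would search the starting page and the pages it links to directly.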

Conclusion

It is amazingly simple to develop Internet-aware applications using the WinInet functions. Keep in mind that the functions presented in this article are quite generic. The InternetOpenUrl function, for instance, is a wrapper to other WinInet functions. If you need to work with the specifics of a particular protocol, such as HTTP or FTP, you should use the more appropriate HTTP or FTP WinInet functions. For reading files from the Internet, however, these functions are very well suited to the task.