Friday, December 3, 2010

C# and the Web: Writing a Web Client Application with Managed Code in the Microsoft .NET Framework

What is a Web Client?

      As you know, data on the Internet can be accessed in several ways. A browser is one type of Web client: it lets you send a request to a particular Uniform Resource Identifier (URI) and receive the response in a standard format. Browsers are great Web clients for human-readable content; they render the HTML response so that you can read the text and see the images. More advanced tasks, such as extracting a stock price from a Web page, can be pretty difficult, and techniques such as page scraping aren't trivial to implement. So how can applications connect to other applications via the Web?
      One answer is the XMLHttpRequest object, which is another kind of Web client. Using this object, you can exchange information with a Web resource in various formats, as shown in the following code:
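// Create the XMLHTTP ActiveX object (JScript, Internet Explorer)
var httpOb = new ActiveXObject("Microsoft.XMLHTTP");

// Synchronous POST; the target URL is left empty here
httpOb.open("POST", "", false);
httpOb.send();                   // open() alone does not transmit the request
var Ans = httpOb.responseXML;    // a property holding the parsed XML response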
      The .NET Framework has a request/response model for accessing data from the Internet. Applications that use this model can request data in a protocol-agnostic manner: the application works with instances of the WebRequest and WebResponse base classes, while the details of the request are carried out by protocol-specific descendant classes. In my Web client, two of those descendants, the HttpWebRequest and HttpWebResponse classes, play a major role.
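      To make the protocol-agnostic idea concrete, here is a minimal sketch rather than the Web client itself. The class name RequestResponseSketch, the Fetch helper, and the temporary file are my own assumptions for illustration. The code talks only to the WebRequest and WebResponse base classes and lets WebRequest.Create choose the descendant from the URI scheme. It uses a file URI so that it runs without a network connection.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper class, not part of the .NET Framework or the article.
public static class RequestResponseSketch
{
    // Reads the whole body of any resource WebRequest can handle.
    // The caller never names a protocol-specific class.
    public static string Fetch(Uri uri)
    {
        // WebRequest.Create selects the descendant class from the URI scheme.
        WebRequest request = WebRequest.Create(uri);

        using (WebResponse response = request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(body))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // A temporary file stands in for a Web resource (assumption for the demo).
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello from a file URI");
        try
        {
            Uri uri = new Uri(path);
            // Show which concrete descendant was chosen for this scheme.
            Console.WriteLine(WebRequest.Create(uri).GetType().Name);
            Console.WriteLine(Fetch(uri));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

      Passing an http URI to the same Fetch method would go through the HTTP descendants instead, with no change to the calling code.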

A Managed Web Client is Born

      The .NET Framework provides all the goodies you need to have full Web client functionality in the world of managed code, but it doesn't provide a full-featured, reusable client comparable to the XMLHttpRequest object. I decided to wrap some managed classes and create a new managed class with identical behavior to the unmanaged XMLHttpRequest object. It turned out to be easier than I thought.
      In order to write the managed Web client, it's important to understand the common language runtime (CLR) and how to program against it. In addition, understanding the Unified Class Library (UCL) structure is even more important in my case so that I'm able to select the right class for the right job. I won't bore you with the technical details about the structure of the CLR and the exact implementation of the garbage collection, and I won't go through every class of the base classes and explain its interface. Instead I will explain only those classes that are relevant to my implementation of the Web client.
      Take a look at the .NET Framework structure in Figure 1. It provides a schematic overview of the UCL namespace structure. The base classes actually expose the operating system services. For example, the System.IO library enables services such as file streaming, reading, and writing, and the System.Net library allows you to handle Web requests, Web responses, sockets, and more. On top of these are the System.Web services, which are used to create Web Forms and Web Services, and the System.Windows.Forms services, which are used to create Windows®-based applications. My application will make extensive use of the System.Net and System.IO libraries for the XMLHttpRequest class.

Figure 1 The Unified Class Library Structure

      The first task is to create a WebRequest object for a specific network request scheme, which in my case is HTTP. To do that, I'll use the System.Net.WebRequest class's CreateDefault method. However, instead of using the returned WebRequest object directly, I'll cast it to the HttpWebRequest class, which is the HTTP-specific implementation of the WebRequest class.
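      A minimal sketch of that first step might look like the following. The HttpRequestFactory class, the CreateHttpRequest method, and the example.com address are assumptions made for illustration, not names from the final Web client. Creating the request object does not contact the server, so the sketch runs without a network connection.

```csharp
using System;
using System.Net;

// Hypothetical wrapper for illustration only.
public static class HttpRequestFactory
{
    // Creates a request for the given URL and exposes it as the
    // HTTP-specific type so that HTTP-only members become available.
    public static HttpWebRequest CreateHttpRequest(string url)
    {
        Uri uri = new Uri(url);

        // CreateDefault returns the built-in descendant for the URI scheme.
        WebRequest request = WebRequest.CreateDefault(uri);

        // The cast succeeds for http and https URIs; for any other scheme
        // it throws an InvalidCastException.
        return (HttpWebRequest)request;
    }

    public static void Main()
    {
        // Placeholder address (assumption); nothing is sent until
        // GetResponse or GetRequestStream is called.
        HttpWebRequest request = CreateHttpRequest("http://example.com/");
        request.Method = "POST";

        Console.WriteLine(request.RequestUri);
        Console.WriteLine(request.Method);
    }
}
```

      Working with the HttpWebRequest type rather than the base class is what gives access to HTTP-specific properties such as UserAgent and KeepAlive.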