DharsansPSP: June 2010

Thursday, June 17, 2010

What is the Difference Between http and https?

Hypertext Transfer Protocol (http) is a system for transmitting and receiving information across the Internet. Http serves as a request and response procedure that all agents on the Internet follow so that information can be rapidly, easily, and accurately disseminated between servers, which hold information, and clients, which are trying to access it. Http is commonly used to access html pages, but other resources can be retrieved through http as well. In many cases, clients may be exchanging confidential information with a server, which needs to be secured in order to prevent unauthorized access. For this reason, https, or secure http, was developed by Netscape Communications Corporation to allow authentication and secured transactions.

In many ways, https is identical to http, because it follows the same basic protocols. The http or https client, such as a Web browser, establishes a connection to a server on a standard port. When a server receives a request, it returns a status and a message, which may contain the requested information or indicate an error if part of the process malfunctioned. Both systems use the same Uniform Resource Identifier (URI) scheme, so that resources can be universally identified. Use of https in a URI scheme rather than http indicates that an encrypted connection is desired.

There are some primary differences between http and https, however, beginning with the default port, which is 80 for http and 443 for https. Https works by transmitting normal http interactions through an encrypted channel, so that in theory, the information cannot be accessed by any party other than the client and end server. Two encryption protocols have been used for this: Secure Sockets Layer (SSL) and its successor, Transport Layer Security (TLS), both of which encrypt the data records being exchanged.

When using an https connection, the client opens the exchange by offering a list of the encryption methods (cipher suites) it supports. In response, the server selects one of them and sends its certificate so that the client can authenticate its identity; the server may also request a certificate from the client. Once both parties have established that they share the same session key, they exchange the encrypted information, and the connection is closed when the exchange is complete. In order to host https connections, a server must have a public key certificate, which embeds key information with a verification of the key owner's identity. Most certificates are verified by a third party so that clients are assured that the key really belongs to the named owner.
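The negotiation described above starts from the protocol versions and cipher suites that each side's TLS library supports. As a minimal sketch, assuming a standard Java runtime with its default security provider, the following lists what the local implementation is able to negotiate. The class name TlsCapabilities is made up for this illustration.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;

// Hypothetical helper: reports what the local Java TLS implementation supports.
class TlsCapabilities {
    // Returns the supported settings of the JVM's default SSLContext.
    private static SSLParameters supported() {
        try {
            return SSLContext.getDefault().getSupportedSSLParameters();
        } catch (NoSuchAlgorithmException e) {
            // Wrapped so callers do not have to handle a checked exception.
            throw new IllegalStateException("No default TLS provider available", e);
        }
    }

    // Cipher suite names this runtime could take part in negotiating.
    static String[] cipherSuites() {
        return supported().getCipherSuites();
    }

    // Protocol versions (SSL/TLS) this runtime supports.
    static String[] protocols() {
        return supported().getProtocols();
    }

    public static void main(String[] args) {
        System.out.println("Protocols: " + String.join(", ", protocols()));
        System.out.println("Cipher suites:");
        for (String suite : cipherSuites()) {
            System.out.println("  " + suite);
        }
    }
}
```

The exact names printed depend on the Java version and its security configuration, so no sample output is shown here.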

Https is used in many situations, such as log-in pages for banking, forms, corporate logons, and other applications in which data needs to be secure. However, https is not infallible, especially when it is not implemented properly, so it is extremely important for end users to be wary about accepting questionable certificates and cautious with their personal information while using the Internet.

In Detail

Hypertext Transfer Protocol Secure (HTTPS) is a combination of the Hypertext Transfer Protocol with the SSL/TLS protocol to provide encryption and secure identification of the server. It uses port 443. HTTPS connections are often used for payment transactions on the World Wide Web and for sensitive transactions in corporate information systems. HTTPS should not be confused with Secure HTTP (S-HTTP) specified in RFC 2660.

The main idea of HTTPS is to create a secure channel over an insecure network. This ensures reasonable protection from eavesdroppers and man-in-the-middle attacks, provided that adequate cipher suites are used and that the server certificate is verified and trusted.

The trust inherent in HTTPS is based on major certificate authorities which come pre-installed in browser software (this is equivalent to saying "I trust certificate authority (e.g. VeriSign/Microsoft/etc.) to tell me who I should trust"). Therefore an HTTPS connection to a website can be trusted if and only if all of the following are true:

The user trusts that their browser software correctly implements HTTPS with correctly pre-installed certificate authorities.
The user trusts the certificate authority to vouch only for legitimate websites without misleading names.
The website provides a valid certificate (an invalid certificate shows a warning in most browsers), which means it was signed by a trusted authority.
The certificate correctly identifies the website (e.g. visiting https://example and receiving a certificate for "Example Inc." and not for anything else).
Either the intervening hops on the Internet are trustworthy, or the user trusts the protocol's encryption layer (TLS or SSL) is unbreakable by an eavesdropper.

Browser integration

When connecting to a site with an invalid certificate, older browsers would present the user with a dialog box asking if they wanted to continue. Newer browsers display a warning across the entire window. Newer browsers also prominently display the site's security information in the address bar.

Extended validation certificates turn the address bar green in newer browsers. Most browsers also pop up a warning to the user when visiting a site that contains a mixture of encrypted and unencrypted content.

Technical

Difference from HTTP

As opposed to HTTP URLs which begin with "http://" and use port 80 by default, HTTPS URLs begin with "https://" and use port 443 by default.
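The default-port rule can be sketched in a few lines of Java. This is only an illustration: the class PortResolver and its method are invented for the example, and example.com stands in for any host.

```java
import java.net.URI;

// Hypothetical helper: works out which port a client would connect to.
class PortResolver {
    // Returns the explicit port if the URL has one, otherwise the
    // default for its scheme (80 for http, 443 for https).
    static int effectivePort(String url) {
        URI uri = URI.create(url);
        if (uri.getPort() != -1) {
            return uri.getPort(); // an explicit port overrides the default
        }
        String scheme = uri.getScheme();
        if ("https".equalsIgnoreCase(scheme)) {
            return 443;
        }
        if ("http".equalsIgnoreCase(scheme)) {
            return 80;
        }
        throw new IllegalArgumentException("Unsupported scheme: " + scheme);
    }

    public static void main(String[] args) {
        System.out.println(effectivePort("http://example.com/index.html"));  // 80
        System.out.println(effectivePort("https://example.com/login"));      // 443
        System.out.println(effectivePort("https://example.com:8443/admin")); // 8443
    }
}
```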

HTTP is insecure and is subject to man-in-the-middle and eavesdropping attacks which can let attackers gain access to website accounts and sensitive information. HTTPS is designed to withstand such attacks and is considered secure (with the exception of older deprecated versions of SSL).

Network layers

HTTP operates at the highest layer of the OSI Model, the Application layer; but the security protocol operates at a lower sublayer, encrypting an HTTP message prior to transmission and decrypting a message upon arrival. Strictly speaking, HTTPS is not a separate protocol, but refers to use of ordinary HTTP over an encrypted Secure Sockets Layer (SSL) or Transport Layer Security (TLS) connection.
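To make the layering concrete, here is a rough sketch in Java in which the HTTP request text is identical in both cases and only the kind of socket underneath changes. The class LayeredHttpClient, its methods, and the host example.com are assumptions for illustration. Note that a bare SSLSocket as used here does not check that the certificate matches the host name unless it is configured to do so, so this is not a complete HTTPS client.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import javax.net.SocketFactory;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical sketch: the same HTTP message sent over a plain or an encrypted socket.
class LayeredHttpClient {
    // Builds an ordinary HTTP/1.1 GET request; nothing here knows about encryption.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n\r\n";
    }

    // Sends the request and returns the first line of the response.
    // Only the socket factory and the port differ between http and https.
    static String statusLine(String host, boolean secure) throws IOException {
        SocketFactory factory = secure ? SSLSocketFactory.getDefault() : SocketFactory.getDefault();
        int port = secure ? 443 : 80;
        try (Socket socket = factory.createSocket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, "/").getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "example.com";
        System.out.println("http : " + statusLine(host, false));
        System.out.println("https: " + statusLine(host, true));
    }
}
```

Running main needs network access; the request-building part works without it.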

Server setup

To prepare a web server to accept HTTPS connections, the administrator must create a public key certificate for the web server. This certificate must be signed by a trusted certificate authority for the web browser to accept it. The authority certifies that the certificate holder is indeed the entity it claims to be. Web browsers are generally distributed with the signing certificates of major certificate authorities so that they can verify certificates signed by them.

Acquiring certificates

Authoritatively signed certificates may be free or cost between US$13 and $1,500 per year.

Organizations may also run their own certificate authority, particularly if they are responsible for setting up browsers to access their own sites (for example, sites on a company intranet, or major universities). They can easily add copies of their own signing certificate to the trusted certificates distributed with the browser.

Peer-to-peer certificate authorities also exist.

Use as access control

The system can also be used for client authentication in order to limit access to a web server to authorized users. To do this, the site administrator typically creates a certificate for each user, a certificate that is loaded into his/her browser. Normally, that contains the name and e-mail address of the authorized user and is automatically checked by the server on each reconnect to verify the user's identity, potentially without even entering a password.

In case of compromised private key

A certificate may be revoked before it expires, for example because the secrecy of the private key has been compromised. Newer versions of popular browsers such as Firefox, Opera, and Internet Explorer on Windows Vista implement the Online Certificate Status Protocol (OCSP) to verify that this is not the case. The browser sends the certificate's serial number to the certificate authority or its delegate via OCSP and the authority responds, telling the browser whether or not the certificate is still valid.

SSL comes in two options, simple and mutual.

The mutual flavor is more secure but requires the user to install a personal certificate in their browser in order to authenticate themselves.

Whatever strategy is used (simple or mutual), the level of protection strongly depends on the correctness of the implementation of the web browser and the server software, and on the actual cryptographic algorithms supported.
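In Java's standard TLS API, the difference between the simple and mutual options comes down to whether the server insists on a client certificate. The following is a minimal sketch under that assumption; the class ServerTlsMode is made up for the example, and the returned parameters would still have to be applied to a real server socket.

```java
import javax.net.ssl.SSLParameters;

// Hypothetical helper: builds server-side TLS settings for the two options.
class ServerTlsMode {
    // mutual = true  -> the handshake fails unless the client presents a certificate
    // mutual = false -> only the server is authenticated (the "simple" option)
    static SSLParameters parametersFor(boolean mutual) {
        SSLParameters params = new SSLParameters();
        params.setNeedClientAuth(mutual);
        return params;
    }

    public static void main(String[] args) {
        System.out.println("simple needs client cert: " + parametersFor(false).getNeedClientAuth());
        System.out.println("mutual needs client cert: " + parametersFor(true).getNeedClientAuth());
    }
}
```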

SSL doesn't prevent the entire site from being indexed using a web crawler, and the URI of the encrypted resource can be inferred by knowing only the intercepted request/response size. This allows an attacker to have access to the plaintext (the publicly-available static content), and the encrypted text (the encrypted version of the static content), permitting a cryptographic attack.

Because SSL operates below HTTP and has no knowledge of higher-level protocols, SSL servers can only strictly present one certificate for a particular IP/port combination. This means that, in most cases, it is not feasible to use name-based virtual hosting with HTTPS. A solution called Server Name Indication (SNI) exists, which sends the hostname to the server before the connection is encrypted, although many older browsers don't support this extension. SNI is supported by newer versions of Firefox, Opera, and Internet Explorer on Windows Vista.
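As an illustration of what SNI adds, the sketch below attaches a host name to a set of TLS parameters using the Java standard library (Java 8 or later). The class SniExample and the host example.com are assumptions for the example. Many Java HTTPS clients set this value automatically, so explicit code like this is mainly useful when working with raw TLS sockets.

```java
import java.util.Collections;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SNIServerName;
import javax.net.ssl.SSLParameters;

// Hypothetical helper: records the host name a client wants to reach.
class SniExample {
    // Returns TLS parameters that announce the given host name during the handshake.
    static SSLParameters withServerName(String hostname) {
        SSLParameters params = new SSLParameters();
        params.setServerNames(
            Collections.<SNIServerName>singletonList(new SNIHostName(hostname)));
        return params;
    }

    // Reads back the first announced host name from the parameters.
    static String firstServerName(SSLParameters params) {
        SNIServerName name = params.getServerNames().get(0);
        return ((SNIHostName) name).getAsciiName();
    }

    public static void main(String[] args) {
        SSLParameters params = withServerName("example.com");
        System.out.println("SNI host: " + firstServerName(params));
    }
}
```

With the host name known up front, a server hosting several sites on one IP address and port can pick the matching certificate.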

If parental controls are enabled on Mac OS X, HTTPS sites must be explicitly allowed using the Always Allow list.

From an architectural point of view:

1- An SSL connection is managed by the first front machine that initiates the SSL connection. If for any reason (routing, traffic optimization, etc.) this front machine is not the application server and it has to decipher data, a way must be found to propagate the user's authentication information or certificate to the application server, which needs to know who is connecting.

2- For SSL with mutual authentication, the SSL session is managed by the first server that initiates the connection. In situations where encryption has to be propagated along chained servers, session timeout management becomes extremely tricky to implement.

3- With mutual SSL, security is maximal, but on the client side there is no way to properly end the SSL connection and disconnect the user except by waiting for the SSL server session to expire or closing all related client applications.

4- For performance reasons, static content is usually delivered through an unencrypted front server or a separate server instance with no SSL. As a consequence, this content is usually not protected.

Wednesday, June 16, 2010

Polymorphism

What is polymorphism?

Polymorphism is a word taken from the Greek, meaning "many forms", or words to that effect.

The purpose of polymorphism as it applies to OOP is to allow one name to be used to specify a general class of actions. Within that general class of actions, the specific action that is applied in any particular situation is determined by the type of data involved.

Polymorphism in ActionScript comes into play when inherited methods are overridden to cause them to behave differently for different types of subclass objects.

Overriding versus overloading methods

If you read much in the area of OOP, you will find the words override and overload used frequently. (This lesson deals with overriding methods and does not deal with overloading.)

Some programming languages such as C++, Java, and C# support a concept known as method or constructor overloading. However, ActionScript 3 does not support method or constructor overloading. Overriding a method is an entirely different thing from overloading a method even for those languages that support overloading.
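Since ActionScript 3 has no overloading, the difference is easier to see in a language that has both. Here is a small Java sketch with made-up class names (Printer, ColorPrinter). Java marks an override with the optional @Override annotation rather than a required override keyword.

```java
// Hypothetical classes contrasting overloading with overriding.
class Printer {
    // Overloading: two methods share a name but take different parameter
    // types. The compiler picks one from the argument's declared type.
    String describe(int value) {
        return "int " + value;
    }

    String describe(String value) {
        return "String " + value;
    }

    // A method that subclasses may override.
    String name() {
        return "Printer";
    }
}

class ColorPrinter extends Printer {
    // Overriding: same name and same parameter list as the inherited
    // method. The version to run is chosen at runtime from the object's type.
    @Override
    String name() {
        return "ColorPrinter";
    }
}

class OverloadVersusOverride {
    public static void main(String[] args) {
        Printer p = new ColorPrinter();
        System.out.println(p.describe(3));     // int 3
        System.out.println(p.describe("abc")); // String abc
        System.out.println(p.name());          // ColorPrinter
    }
}
```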

Modify the behavior of an inherited method

Polymorphism can exist when a subclass modifies or customizes the behavior of a method inherited from its superclass in order to meet the special requirements of objects instantiated from the subclass. This is known as overriding a method and requires the use of the override keyword in ActionScript.

Override methods differently for different subclasses

ActionScript supports the notion of overriding a method inherited from a superclass to cause the named method to behave differently when called on objects of different subclasses, each of which extends the same superclass and overrides the same method name.

Example - compute the area of different geometrical shapes

For example, consider the computation of the area of a geometrical shape in a situation where the type of geometrical shape is not known when the program is compiled. Polymorphism is a tool that can be used to handle this situation.

Circle and Rectangle extend Shape

Assume that classes named Circle and Rectangle each extend a class named Shape. Assume that the Shape class defines a method named area. Assume further that the area method is properly overridden in the Circle and Rectangle classes to return the correct area for a circle or a rectangle respectively.

Three types of objects

In this case, a Circle object is a Shape object because the Circle class extends the Shape class. Similarly, a Rectangle object is also a Shape object.

Therefore, an object of the Shape class, the Circle class, or the Rectangle class can be instantiated and any one of the three can be saved in a variable of type Shape.

Flip a virtual random coin

Assume that the program flips a virtual coin and, depending on the outcome of the flip, instantiates an object of either the Circle class or the Rectangle class and saves it in a variable of type Shape. Assuming that the coin flip is truly random, the compiler cannot possibly know at compile time which type of object will be stored in the variable at runtime.

Two versions of the area method

Regardless of which type of object is stored in the variable, the object will contain two versions of the method named area. One version is the version that is defined in the Shape class, and this version will be the same regardless of whether the object is a circle or a rectangle.

Also, this version can't return a valid area because a general shape doesn't have a valid area. However, if the area method is defined to return a value, even this version must return a value even if it isn't valid. (Other programming languages get around this problem with something called an abstract class, which isn't allowed in ActionScript 3.)

The other version of the area method will be different for a Circle object and a Rectangle object due simply to the fact the algorithm for computing the area of a circle is different from the algorithm for computing the area of a rectangle.
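For comparison, here is how the abstract-class approach mentioned above looks in Java, one of the languages that allows it. This is a sketch with invented names (AbstractShape, Square), not ActionScript code. The general class declares area without a body, so it never has to return an invalid value.

```java
// Hypothetical general type: it cannot be instantiated and supplies no area of its own.
abstract class AbstractShape {
    // No body here; every concrete subclass must provide one.
    abstract double area();

    // Ordinary methods can still be shared by all subclasses.
    String describe() {
        return getClass().getSimpleName() + " with area " + area();
    }
}

class Square extends AbstractShape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    double area() {
        return side * side;
    }
}

class AbstractShapeDemo {
    public static void main(String[] args) {
        AbstractShape shape = new Square(2.0);
        System.out.println(shape.describe()); // Square with area 4.0
        // new AbstractShape() would not compile: the class is abstract.
    }
}
```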

Call the area method on the object

If the program calls the area method on the object stored in the variable of type Shape at runtime, the correct version of the area method will be selected and executed and will return the correct area for the type of object stored in that variable. This is runtime polymorphism based on method overriding.
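The whole scenario can be sketched as follows. The code is Java rather than ActionScript, and the constructor arguments and the CoinFlipDemo class are assumptions added for the example. As in the description above, the Shape version of area returns a placeholder value.

```java
import java.util.Random;

class Shape {
    // A general shape has no meaningful area; 0.0 is only a placeholder.
    double area() {
        return 0.0;
    }
}

class Circle extends Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    double area() {
        return Math.PI * radius * radius;
    }
}

class Rectangle extends Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    double area() {
        return width * height;
    }
}

class CoinFlipDemo {
    // Flips a virtual coin; the compiler cannot know which branch will run.
    static Shape randomShape(Random coin) {
        return coin.nextBoolean() ? new Circle(1.0) : new Rectangle(2.0, 3.0);
    }

    public static void main(String[] args) {
        Shape shape = randomShape(new Random());
        // The overridden version matching the actual object is selected at runtime.
        System.out.println(shape.getClass().getSimpleName() + " area: " + shape.area());
    }
}
```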

A more general description of runtime polymorphism

A reference variable of a superclass type can be used to reference an object instantiated from any subclass of the superclass.

If an overridden method in a subclass object is called using a superclass-type reference variable, the system will determine, at runtime, which version of the method to use based on the true type of the object, and not based on the type of reference variable used to call the method.

The general rule

The type of the reference determines the names of the methods that can be called on the object. The actual type of the object determines which of possibly several methods having the same name will be executed.
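A short Java sketch of this rule, using made-up classes (Instrument, Piano): the reference type limits which method names compile, while the object's actual type picks the version that runs.

```java
// Hypothetical classes illustrating reference type versus object type.
class Instrument {
    String play() {
        return "generic sound";
    }
}

class Piano extends Instrument {
    @Override
    String play() {
        return "piano chord";
    }

    // Exists only in Piano, so it is not callable through an Instrument reference.
    String tune() {
        return "piano tuned";
    }
}

class ReferenceTypeDemo {
    public static void main(String[] args) {
        Instrument ref = new Piano();

        // Allowed: play is declared in Instrument. The Piano version runs.
        System.out.println(ref.play()); // piano chord

        // ref.tune(); // would not compile: Instrument declares no tune method

        // A cast changes the reference type, which makes tune callable.
        if (ref instanceof Piano) {
            Piano piano = (Piano) ref;
            System.out.println(piano.tune()); // piano tuned
        }
    }
}
```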

Selection at runtime

Therefore, it is possible (at runtime) to select among a family of overridden methods and determine which method to execute based on the type of the subclass object pointed to by the superclass-type reference when the overridden method is called on the superclass-type reference.

Runtime polymorphism

In some situations, it is possible to identify and call an overridden method at runtime that cannot be identified at compile time. In those situations, the identification of the required method cannot be made until the program is actually running. This is often referred to as late binding, dynamic binding, or run-time polymorphism.

Encapsulation

What is abstraction?

Abstraction is the process by which we specify a new data type, often referred to as an abstract data type or ADT.

How does abstraction relate to encapsulation?

Encapsulation is the process of gathering an ADT's data representation and behavior into one encapsulated entity. In other words, encapsulation converts from the abstract to the concrete.

Some analogies

You might think of this as being similar to converting an idea for an invention into a set of blueprints from which it can be built, or converting a set of written specifications for a widget into a set of drawings that can be used by the machine shop to build the widget.

Automotive engineers encapsulated the specifications for the steering mechanism of my car into a set of manufacturing drawings. Then manufacturing personnel used those drawings to produce an object where they exposed the interface (steering wheel) and hid the implementation (levers, bolts, etc.).

In all likelihood, the steering mechanism object contains a number of other more-specialized embedded objects, each of which has state and behavior and also has an interface and an implementation.

The interfaces for those embedded objects aren't exposed to me, but they are exposed to the other parts of the steering mechanism that use them.

Abstraction

Abstraction is the specification of an abstract data type, which includes a specification of the type's data representation and its behavior. In particular,

* What kind of data can be stored in an entity of the new type, and

* What are all the ways that the data can be manipulated?

A new type

For our purposes, an abstract data type is a new type (not intrinsic to the ActionScript language). It is not one of the primitive data types that are built into the programming language (such as Boolean, int, Number, String, and uint).

Already known to the compiler

The distinction in the previous paragraph is very important. The data representation and behavior of the intrinsic or primitive types is already known to the compiler and cannot normally be modified by the programmer.

Not known to the compiler

The representation and behavior of an abstract type is not known to the compiler until it is defined by the programmer and presented to the compiler in an appropriate manner.

Define data representation and behavior in a class

ActionScript programmers define the data representation and the behavior of a new type (present the specification to the compiler) using the keyword class. In other words, the keyword class is used to convert the specification of a new type into something that the compiler can work with; a set of plans as it were. To define a class is to go from the abstract to the concrete.

Create instances of the new type

Once the new type (class) is defined, one or more objects of that type can be brought into being (instantiated, caused to occupy memory).

Objects have state and behavior

Once instantiated, the object is said to have state and behavior. The state of an object is determined by the current values of the data that it contains and the behavior of an object is determined by its methods.

The state and behavior of a GUI Button object

For example, if we think of a GUI Button as an object, it is fairly easy to visualize the object's state and behavior.

A GUI Button can usually manifest any of a number of different states: size, position, depressed image, not depressed image, label, etc. Each of these states is determined by data stored in the instance variables of the Button object at any given point in time. (The combination of one or more instance variables that determine a particular state is often referred to as a property of the object.)

Similarly, it is not too difficult to visualize the behavior of a GUI Button. When you click it with the mouse, some specific action usually occurs.

An ActionScript class named Button

If you dig deeply enough into the ActionScript class library, you will find that there is a class named Button. Each individual Button object in a Flex application is an instance of the ActionScript class named Button.

The state of Button objects

Each Button object has instance variables, which it does not share with other Button objects. The values of the instance variables define the state of the button at any given time. Other Button objects in the same scope can have different values in their instance variables. Hence they can have a different state.

The behavior of a Button object

Each Button object also has certain fundamental behaviors such as responding to a mouse click event or responding to a mouseOver event.

The ActionScript programmer has control over the code that is executed in response to the event. However, the ActionScript programmer has no control over the fact that a Button object will respond to such an event. The fact that a Button will respond to certain event types is an inherent part of the type specification for the Button class and can only be modified by modifying the source code for the Button class.
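The split between state and behavior can be sketched with a toy class. SimpleButton below is an invented Java class, not the Flex Button class: each instance keeps its own state, the click method is fixed by the class, and only the handler code is supplied by the programmer using it.

```java
import java.util.Objects;

// Hypothetical stand-in for a GUI button.
class SimpleButton {
    // State: every SimpleButton object has its own copy of these variables.
    private final String label;
    private int clickCount;
    private Runnable clickHandler = () -> { }; // does nothing until replaced

    SimpleButton(String label) {
        this.label = label;
    }

    String getLabel() {
        return label;
    }

    int getClickCount() {
        return clickCount;
    }

    // The using programmer decides WHAT runs on a click...
    void setClickHandler(Runnable handler) {
        this.clickHandler = Objects.requireNonNull(handler);
    }

    // ...but THAT the button responds to a click is built into the class.
    void click() {
        clickCount++;
        clickHandler.run();
    }
}

class SimpleButtonDemo {
    public static void main(String[] args) {
        SimpleButton ok = new SimpleButton("OK");
        SimpleButton cancel = new SimpleButton("Cancel");
        ok.setClickHandler(() -> System.out.println("OK was clicked"));
        ok.click();
        // The two objects do not share state.
        System.out.println(ok.getLabel() + ": " + ok.getClickCount());         // OK: 1
        System.out.println(cancel.getLabel() + ": " + cancel.getClickCount()); // Cancel: 0
    }
}
```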

Encapsulation

If abstraction is the design or specification of a new type, then encapsulation is its definition and implementation.

A programmer defines the data representation and the behavior of an abstract data type into a class, thereby defining its implementation and its interface. That data representation and behavior is then encapsulated in objects that are instantiated from the class.

Expose the interface and hide the implementation

According to good object-oriented programming practice, an encapsulated design usually exposes the interface and hides the implementation. This is accomplished in different ways with different languages.

Just as most of us don't usually need to care about how the steering mechanism of a car is implemented, a user of a class should not need to care about the details of implementation for that class.

The user of the class (the using programmer) should only need to care that it works as advertised. Of course this assumes that the user of the class has access to good documentation describing the interface and the behavior of objects instantiated from the class.

Should be able to change the implementation later

For a properly designed class, the class designer should be able to come back later and change the implementation, perhaps changing the type of data structure used to store data in the object, and the using programs should not be affected by the change.
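As a rough illustration in Java, the two invented classes below stand for an earlier and a later version of the same class. The public methods are identical, but the private data structure differs, so code that only pushes and pops cannot tell them apart. Popping from an empty stack is left out of the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical first version: stores tasks in an ArrayList.
class TaskStackV1 {
    private final List<String> items = new ArrayList<>();

    public void push(String task) {
        items.add(task);
    }

    public String pop() {
        return items.remove(items.size() - 1); // last added comes out first
    }

    public boolean isEmpty() {
        return items.isEmpty();
    }
}

// Hypothetical later version: same public methods, different storage.
class TaskStackV2 {
    private final Deque<String> items = new ArrayDeque<>();

    public void push(String task) {
        items.push(task);
    }

    public String pop() {
        return items.pop(); // last added comes out first
    }

    public boolean isEmpty() {
        return items.isEmpty();
    }
}

class TaskStackDemo {
    public static void main(String[] args) {
        TaskStackV1 before = new TaskStackV1();
        TaskStackV2 after = new TaskStackV2();
        before.push("write");
        before.push("test");
        after.push("write");
        after.push("test");
        System.out.println(before.pop() + " / " + after.pop()); // test / test
    }
}
```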

Class member access control

Object-oriented programming languages usually provide the ability to control access to the members of a class. For example, ActionScript, C++ and Java all use the keywords public, private, and protected to control access to the individual members of a class. In addition, ActionScript and Java add a fourth level of access control, which is called internal in ActionScript and is called package-private in Java. (See Class property attributes in a companion document on ActionScript Resources.)

Public, private, and protected

To a first approximation, you can probably guess what public and private mean. Public members are accessible by all code that has access to an object of the class. Private members are accessible only by members belonging to the class.

The protected keyword is used to provide inherited classes with special access to the members of their base classes.
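Here is a small Java sketch of the three keywords, with invented classes (Account, SavingsAccount). One Java-specific detail worth knowing: in Java, protected members are also visible to other classes in the same package, which may differ from other languages.

```java
// Hypothetical classes showing public, private, and protected members.
class Account {
    // private: only code inside Account can touch this variable.
    private double balance;

    // protected: available to Account and to classes that extend it.
    protected String ownerNote = "";

    // public: available to any code that holds an Account object.
    public double getBalance() {
        return balance;
    }

    public void deposit(double amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("Deposit must be positive");
        }
        balance += amount;
    }
}

class SavingsAccount extends Account {
    void annotate(String note) {
        ownerNote = note; // allowed: inherited protected member
        // balance = 0;   // would not compile: balance is private to Account
    }

    String note() {
        return ownerNote;
    }
}

class AccessDemo {
    public static void main(String[] args) {
        SavingsAccount savings = new SavingsAccount();
        savings.deposit(50.0);
        savings.annotate("holiday fund");
        System.out.println(savings.getBalance() + " (" + savings.note() + ")"); // 50.0 (holiday fund)
    }
}
```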

A public user interface

In general, the user interface for a class consists of the public methods. (The variables in a class can also be declared public but this is generally considered to be bad programming practice unless they are actually constants.)

For a properly designed class, the class user stores, reads, and modifies values in the object's data by calling the public methods on a specific instance (object) of the class. (This is sometimes referred to as sending a message to the object asking it to change its state).

ActionScript has a special form of method, often called an implicit setter method or an implicit getter method, that is specifically used for this purpose. (You will see several implicit setter methods in the program that I will explain later in this lesson.)

Normally, if the class is properly designed and the implementation is hidden, the user cannot modify the values contained in the instance variables of the object without going through the prescribed public methods in the interface.
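A sketch of the idea in Java, which has no implicit accessor syntax and uses ordinary get and set methods instead. The Thermostat class and its limits of 5 and 35 degrees are invented for the example. Because the variable is private, the only way to change it is through the setter, which can refuse bad values.

```java
// Hypothetical class whose state can only change through its public methods.
class Thermostat {
    private double targetCelsius = 20.0; // hidden from using programs

    public double getTargetCelsius() {
        return targetCelsius;
    }

    // The setter is the single gateway to the variable, so it can enforce rules.
    public void setTargetCelsius(double value) {
        if (value < 5.0 || value > 35.0) {
            throw new IllegalArgumentException("Target must be between 5 and 35 degrees");
        }
        targetCelsius = value;
    }
}

class ThermostatDemo {
    public static void main(String[] args) {
        Thermostat thermostat = new Thermostat();
        thermostat.setTargetCelsius(22.0);
        try {
            thermostat.setTargetCelsius(100.0); // rejected by the setter
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        System.out.println(thermostat.getTargetCelsius()); // 22.0
    }
}
```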

Not a good design by default

An object-oriented design is not a good design by default. In an attempt to produce good designs, experienced object-oriented programmers generally agree on certain design standards for classes. For example, the data members (instance variables) are usually private unless they are constants. The user interface usually consists only of public methods and includes few if any data members.

Of course, there are exceptions to every rule. One exception to this general rule is that data members that are intended to be used as symbolic constants are made public and defined in such a way that their values cannot be modified.

The methods in the interface should control access to, or provide a pathway to the private instance variables.

Not bound to the implementation

The interface should be generic in that it is not bound to any particular implementation. Hence, the class author should be able to change the implementation without affecting the using programs so long as the interface doesn't change.

In practice, this means that the signatures of the interface methods should not change, and that the interface methods and their arguments should continue to have the same meaning.

OOPs Concepts New

The three main characteristics of an object-oriented program

Object-oriented programs exhibit three main characteristics:

* Encapsulation

* Inheritance

* Polymorphism

We use these three concepts extensively as we attempt to model the real-world problems that we are trying to solve with our object-oriented programs. I will provide brief descriptions of these concepts in the remainder of this lesson and explain them in detail in future lessons.

Encapsulation example

Consider the steering mechanism of a car as a real-world example of encapsulation. During the past eighty years or so, the steering mechanism for the automobile has evolved into an object in the OOP sense.

Only the interface is exposed

In particular, most of us know how to use the steering mechanism of an automobile without having any idea whatsoever how it is implemented. All most of us care about is the interface, which we often refer to as a steering wheel. We know that if we turn the steering wheel clockwise, the car will turn to the right, and if we turn it counterclockwise, the car will turn to the left.

How is it implemented?

Most of us don't know, and don't really care, how the steering mechanism is actually implemented "under the hood." In fact, there are probably a number of different implementations for various brands and models of automobiles. Regardless of the brand and model, however, the human interface is pretty much the same. Clockwise turns to the right, counterclockwise turns to the left.

As in the steering mechanism for a car, a common approach in OOP is to "hide the implementation" and "expose the interface" through encapsulation.

Inheritance example

Another important aspect of OOP is inheritance. Let's form an analogy with the teenager who is building a hotrod. That teenager doesn't normally start with a large chunk of steel and carve an engine out of it. Rather, the teenager will usually start with an existing engine and make improvements to it.

In OOP lingo, that teenager extends the existing engine, derives from the existing engine, inherits from the existing engine, or subclasses the existing engine (depending on which author is describing the process).

Just like in "souping up" an engine for a hotrod, a very common practice in OOP is to create new improved objects by extending existing class definitions.
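The hotrod analogy translates into code roughly as follows. This is a Java sketch with made-up classes and numbers (Engine, TunedEngine, 150 and 50 horsepower). The new class keeps what it inherits, improves one behavior, and adds one of its own.

```java
// Hypothetical existing class: the stock engine.
class Engine {
    int horsepower() {
        return 150;
    }

    String start() {
        return "engine running";
    }
}

// The souped-up engine extends the existing one instead of starting from scratch.
class TunedEngine extends Engine {
    // Improves an inherited behavior by building on the original.
    @Override
    int horsepower() {
        return super.horsepower() + 50;
    }

    // Adds a behavior the stock engine does not have.
    String engageTurbo() {
        return "turbo engaged";
    }
}

class HotrodDemo {
    public static void main(String[] args) {
        TunedEngine engine = new TunedEngine();
        System.out.println(engine.start());       // engine running (inherited unchanged)
        System.out.println(engine.horsepower());  // 200
        System.out.println(engine.engageTurbo()); // turbo engaged
    }
}
```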

Reuse, don't reinvent

One of the major arguments in favor of OOP is that it provides a formal mechanism that encourages the reuse of existing programming elements. One of the mottos of OOP is "reuse, don't reinvent."

Polymorphism example

A third important aspect of OOP is polymorphism. This is a Greek word meaning something like one name, many forms. This is a little more difficult to explain in non-programming terminology. However, we will stretch our imagination a little and say that polymorphism is somewhat akin to the automatic transmission in your car. In my Honda, for example, the automatic transmission has four different methods or functions known collectively as Drive (in addition to the functions of Reverse, Park, and Neutral).

Select Drive to go forward

As an operator of the automobile, I simply select Drive (meaning go forward). Depending on various conditions at runtime, the automatic transmission system decides which version of the Drive function to use in every specific situation. The specific version of the function that is used is based on the current conditions (speed, incline, etc.). This is somewhat analogous to what we will refer to in a subsequent tutorial lesson as runtime polymorphism.

In addition to the three explicit characteristics of encapsulation, inheritance, and polymorphism, an object-oriented program also has an implicit characteristic of abstraction.

What is abstraction?

Abstraction is the process by which we specify a new data type, often

referred to an abstract data type or ADT.

How does abstraction relate to encapsulation?

Encapsulation is the process of gathering an ADT's data representation and behavior into one encapsulated entity. In other words, encapsulation converts from the abstract to the concrete.

Some analogies

The interfaces for those embedded objects aren't exposed to me, but they are exposed to the other parts of the steering mechanism that use them.

Abstraction

Abstraction is the specification of an abstract data type, which includes a specification of the type's data representation and its behavior. In particular,

* What kind of data can be stored in an entity of the new type, and

* What are all the ways that the data can be manipulated?

A new type

Already known to the compiler

Not known to the compiler

The representation and behavior of an abstract type is not known to the compiler until it is defined by the programmer and presented to the compiler in an appropriate manner.

Define data representation and behavior in a class

Create instances of the new type

Once the new type (class) is defined, one or more objects of that type can be brought into being (instantiated, caused to occupy memory).

Objects have state and behavior

The state and behavior of a GUI Button object

For example, if we think of a GUI Button as an object, it is fairly easy to visualize the object's state and behavior.

Similarly, it is not too difficult to visualize the behavior of a GUI Button. When you click it with the mouse, some specific action usually occurs.

An ActionScript class named Button

The state of Button objects

The behavior of a Button object

Each Button object also has certain fundamental behaviors such as responding to a mouse click event or responding to a mouseOver event.

Encapsulation

If abstraction is the design or specification of a new type, then encapsulation is its definition and implementation.

Expose the interface and hide the implementation

Just as most of us don't usually need to care about how the steering mechanism of a car is implemented, a user of a class should not need to care about the details of implementation for that class.

Should be able to change the implementation later

Class member access control

Public, private, and protected

The protected keyword is used to provide inherited classes with special access to the members of their base classes.
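As a minimal sketch of how the three access levels behave in Java, consider the classes below. The names Vehicle, Car, and AccessDemo and their members are hypothetical, chosen only for illustration.

```java
// Hypothetical classes for illustration; not part of the original tutorial.
class Vehicle {
    private String serialNumber = "unknown"; // visible only inside Vehicle
    protected int speed = 0;                 // visible to subclasses (and, in Java, to the same package)

    public int getSpeed() {                  // visible to all code
        return speed;
    }
}

class Car extends Vehicle {
    public void accelerate(int delta) {
        speed += delta;            // allowed: speed is protected
        // serialNumber = "X1";    // would not compile: serialNumber is private to Vehicle
    }
}

class AccessDemo {
    public static void main(String[] args) {
        Car car = new Car();
        car.accelerate(30);
        System.out.println(car.getSpeed()); // prints 30
    }
}
```

Code outside the class hierarchy reads the speed only through the public method getSpeed.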

A public user interface

Not a good design by default

The methods in the interface should control access to, or provide a pathway to, the private instance variables.

Not bound to the implementation

In practice, this means that the signatures of the interface methods should not change, and that the interface methods and their arguments should continue to have the same meaning.
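The idea of a stable interface over a changeable implementation can be sketched as follows. The class Account and its methods deposit and getBalance are assumptions made up for this example; the point is only that the private field could change type later without callers noticing.

```java
// Hypothetical class: Account, deposit, and getBalance are illustrative names.
class Account {
    // Implementation detail: the balance is held as whole cents.
    // This could later be replaced by another representation
    // without changing the public methods below.
    private long cents;

    // Public interface: these signatures are the part that should stay stable.
    public void deposit(double amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        cents += Math.round(amount * 100);
    }

    public double getBalance() {
        return cents / 100.0;
    }
}

class AccountDemo {
    public static void main(String[] args) {
        Account a = new Account();
        a.deposit(12.50);
        a.deposit(0.25);
        System.out.println(a.getBalance()); // prints 12.75
    }
}
```

Callers depend only on deposit and getBalance, so the private field cents is free to change as long as those two methods keep their signatures and meaning.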

Abstraction

Abstraction is the process of hiding the details and exposing only the essential features of a particular concept or object.

The technique of choosing common features of objects and methods is known as abstracting. It also involves concealing the details and highlighting only the essential features of a particular object or concept. A Java programmer makes use of abstraction to specify that a couple of functions perform the same kind of task and can be merged into a single function.

Abstraction, together with information hiding and encapsulation, is among the most significant techniques in software engineering. All three are known to reduce complexity in processing and programming.

Abstraction is the facility to define objects that represent abstract "actors" that can perform work, report on and change their state, and "communicate" with other objects in the system.

public class Animal extends LivingThing
{
    private Location loc;
    private double energyReserves;

    boolean isHungry() {
        return energyReserves < 2.5;
    }
    void eat(Food f) {
        // Consume food
        energyReserves += f.getCalories();
    }
    void moveTo(Location l) {
        // Move to new location
        loc = l;
    }
}
With the above definition, one could create objects of type Animal and call their methods like this:
public static void main(String[] args)
{
    // tableScraps and grass (of type Food) and theBarn (of type Location)
    // are assumed to be defined elsewhere.
    Animal thePig = new Animal();
    Animal theCow = new Animal();
    if (thePig.isHungry()) {
        thePig.eat(tableScraps);
    }
    if (theCow.isHungry()) {
        theCow.eat(grass);
    }
    theCow.moveTo(theBarn);
}
In the above example, the class Animal is an abstraction used in place of an actual animal, and LivingThing is a further abstraction (in this case a generalisation) of Animal.

If a more differentiated hierarchy of animals is required, say to distinguish those that provide milk from those that provide nothing except meat at the end of their lives, that calls for an intermediary level of abstraction: probably DairyAnimal (cows, goats), which would eat foods suited to giving good milk, and MeatAnimal (pigs, steers), which would eat foods that give the best meat quality.
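Such an intermediary level could be sketched as below. This is only an illustration: LivingThing and Food are reduced to minimal stand-ins, Location is left out, MeatAnimal is a name assumed here for the meat-producing branch, and the product method is invented for the example.

```java
// Minimal stand-ins so the sketch compiles on its own.
class LivingThing { }

class Food {
    private final double calories;
    Food(double calories) { this.calories = calories; }
    double getCalories() { return calories; }
}

class Animal extends LivingThing {
    private double energyReserves;

    boolean isHungry() { return energyReserves < 2.5; }
    void eat(Food f) { energyReserves += f.getCalories(); }
}

// Intermediary abstractions between Animal and concrete animals.
// The product() method is hypothetical.
class DairyAnimal extends Animal {
    String product() { return "milk"; }
}

class MeatAnimal extends Animal {
    String product() { return "meat"; }
}

class FarmDemo {
    public static void main(String[] args) {
        DairyAnimal theCow = new DairyAnimal();
        MeatAnimal thePig = new MeatAnimal();
        System.out.println(theCow.isHungry()); // prints true (reserves start at 0.0)
        theCow.eat(new Food(3.0));
        System.out.println(theCow.isHungry()); // prints false
        System.out.println(theCow.product());  // prints milk
        System.out.println(thePig.product());  // prints meat
    }
}
```

Both subclasses inherit isHungry and eat from Animal and add only what distinguishes them.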

We illustrate this process by way of trying to solve the following problem using a computer language called Java.

Problem: Given a rectangle 4.5 ft wide and 7.2 ft high, compute its area.

We know the area of a rectangle is its width times its height. So all we have to do to solve the above problem is to multiply 4.5 by 7.2 and get the answer. The question is how to express the above solution in Java, so that the computer can perform the computation.

Data Abstraction

The product of 4.5 by 7.2 is expressed in Java as: 4.5 * 7.2. In this expression, the symbol * represents the multiplication operation. 4.5 and 7.2 are called number literals. Using DrJava, we can type the expression 4.5 * 7.2 directly into the Interactions window and see the answer.

Now suppose we change the problem to compute the area of a rectangle of width 3.6 and height 9.3. Has the original problem really changed at all? To put it another way, has the essence of the original problem changed? After all, the formula for computing the answer is still the same. All we have to do is to enter 3.6 * 9.3. What is it that has not changed (the invariant)? And what is it that has changed (the variant)?

Type Abstraction

The problem has not changed in that it still deals with the same geometric shape, a rectangle, described in terms of the same dimensions, its width and height. What vary are simply the values of the width and the height. The formula to compute the area of a rectangle given its width and height does not change:

width * height

It does not care what the actual specific values of width and height are. What it cares about is that the values of width and height must be such that the multiplication operation makes sense. How do we express the above invariants in Java?

We just want to think of the width and height of a given rectangle as elements of the set of real numbers. In computing, we group values with common characteristics into a set and call it a type. In Java, the type double is the set of real numbers that are implemented inside the computer in some specific way. The details of this internal representation are immaterial for our purpose and thus can be ignored. In addition to the type double, Java provides many more pre-built types such as int to represent the set of integers and char to represent the set of characters. We will examine and use them as the need arises in future examples. As to our problem, we only need to restrict ourselves to the type double.

We can define the width and the height of a rectangle as double in Java as follows.

double width;
double height;

The above two statements are called variable definitions, where width and height are said to be variable names. In Java, a variable represents a memory location inside the computer. We define a variable by first declaring its type, then following the type with the name of the variable, and terminating the definition with a semicolon. This is a Java syntax rule. Violating a syntax rule constitutes an error. When we define a variable in this manner, its associated memory content is initialized to a default value specified by the Java language. (Strictly, Java guarantees this for the fields of a class; a local variable declared inside a method must be assigned before it is read.) For variables of type double, the default value is 0.0.

Finger Exercise:
Use the Interactions pane of DrJava to evaluate width and height and verify that their values are set to 0.
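Outside DrJava, the default-value rule can be observed on the fields of a class. The following is a small sketch; RectangleFields and DefaultsDemo are hypothetical names introduced only for this illustration.

```java
// Hypothetical class: its fields mirror the two variable definitions above.
class RectangleFields {
    double width;   // a field: Java initializes it to 0.0
    double height;  // a field: Java initializes it to 0.0
}

class DefaultsDemo {
    public static void main(String[] args) {
        RectangleFields r = new RectangleFields();
        System.out.println(r.width);   // prints 0.0
        System.out.println(r.height);  // prints 0.0

        double local;
        // System.out.println(local);  // would not compile: local has not been assigned
        local = 4.5;
        System.out.println(local);     // prints 4.5
    }
}
```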

Once we have defined the width and height variables, we can solve our problem by writing the expression that computes the area of the associated rectangle in terms of width and height as follows.

width * height

Observe that the two variable definitions, together with the expression to compute the area presented above, directly translate the description of the problem (two real numbers representing the width and the height of a rectangle) and the high-level thinking of what the solution of the problem should be (area is the width times the height). We have just expressed the invariants of the problem and its solution. Now, how do we vary width and height in Java? We use what is called the assignment operation. To assign the value 4.5 to the variable width and the value 7.2 to the variable height, we write the following Java assignment statements.

width = 4.5;
height = 7.2;

The syntax rule for the assignment statement in Java is: first write the name of the variable, then follow it by the equal sign, then follow the equal sign by a Java expression, and terminate it with a semicolon. The semantics (i.e. meaning) of such an assignment is: evaluate the expression on the right hand side of the equal sign and assign the resulting value into the memory location represented by the variable name on the left hand side of the equal sign. It is an error if the type of the expression on the right hand side is not a subset of the type of the variable on the left hand side.
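The "subset" rule can be illustrated with int and double. This is a sketch only; the variable names wholeFeet and truncated and the class AssignmentDemo are made up for the example.

```java
class AssignmentDemo {
    public static void main(String[] args) {
        double width;
        double height;

        width = 4.5;          // a double literal assigned to a double variable
        int wholeFeet = 7;
        height = wholeFeet;   // allowed: every int value is also a double value (7 becomes 7.0)

        // int bad = width;   // would not compile: a double value may not fit in an int
        int truncated = (int) width; // an explicit cast is required, and it drops the fraction

        System.out.println(height);     // prints 7.0
        System.out.println(truncated);  // prints 4
    }
}
```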

Now if we evaluate width * height again (using the Interactions window of DrJava), we should get the desired answer. Life is good so far, though there is a little bit of inconvenience here: we have to type the expression width * height each time we are asked to compute the area of a rectangle with a given width and a given height. This may be OK for such a simple formula, but what if the formula is something much more complex, like computing the length of the diagonal of a rectangle? Re-typing the formula each time is quite an error-prone process. Is there a way to have the computer memorize the formula and perform the computation behind the scenes so that we do not have to memorize it and rewrite it ourselves? The answer is yes, and it takes a little bit more work to achieve this goal in Java.

What we would like to do is to build the equivalent of a black box that takes in as inputs two real numbers (recall type double) with a button. When we put in two numbers and depress the button, "magically" the black box will compute the product of the two input numbers and spit out the result, which we will interpret as the area of a rectangle whose width and height are given by the two input numbers. This black box is in essence a specialized calculator that can only compute one thing: the area of a rectangle given a width and a height. To build this box in Java, we use a construct called a class, which looks like the following.

class AreaCalc {
    double rectangle(double width, double height) {
        return width * height;
    }
}

What this Java code means is something like: AreaCalc is a blueprint of a specialized computing machine that is capable of accepting two input doubles, one labeled width and the other labeled height, computing their product and returning the result. This computation is given a name: rectangle. In Java parlance, it is called a method of the class AreaCalc.

Here is an example of how we use AreaCalc to compute the area of a rectangle of width 4.5 and height 7.2. In the Interactions pane of DrJava, enter the following lines of code.

AreaCalc calc = new AreaCalc();
calc.rectangle(4.5, 7.2)

The first line of code defines calc as a variable of type AreaCalc and assigns to it an instance of the class AreaCalc. new is a keyword in Java. It is an example of what is called a class operator. It operates on a class and creates an instance (also called an object) of the given class. The second line of code is a call to the object calc to perform the rectangle task where width is assigned the value 4.5 and height is assigned the value 7.2. To get the area of a 5.6 by 8.4 rectangle, we simply use the same calculator calc again:

calc.rectangle(5.6, 8.4);
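Outside the Interactions pane, the same calls can be placed in a small program. The sketch below also applies the black-box idea to the diagonal of a rectangle, the "more complex" formula mentioned earlier. DiagonalCalc and CalcDemo are names assumed here for illustration; only AreaCalc comes from the text.

```java
class AreaCalc {
    double rectangle(double width, double height) {
        return width * height;
    }
}

// Hypothetical companion calculator: same inputs, different formula.
class DiagonalCalc {
    double rectangle(double width, double height) {
        // Pythagorean theorem: diagonal = sqrt(width^2 + height^2)
        return Math.sqrt(width * width + height * height);
    }
}

class CalcDemo {
    public static void main(String[] args) {
        AreaCalc calc = new AreaCalc();
        System.out.println(calc.rectangle(4.5, 7.2)); // area of the 4.5 by 7.2 rectangle
        System.out.println(calc.rectangle(5.6, 8.4)); // same object, different inputs

        DiagonalCalc diag = new DiagonalCalc();
        System.out.println(diag.rectangle(3.0, 4.0)); // prints 5.0
    }
}
```

In both calculators the formula is written once inside the method and never retyped by the caller.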

So instead of solving just one problem (given a rectangle 4.5 ft wide and 7.2 ft high, compute its area), we have built a "machine" that can compute the area of any given rectangle. But what about computing the area of a right triangle with height 5 and base 4? We cannot simply use this calculator. We need another specialized calculator, the kind that can compute the area of a right triangle.

There are at least two different designs for such a calculator.

* create a new class called AreaCalc2 with one method called rightTriangle with two input parameters of type double. This corresponds to designing a different area calculator with one button labeled rightTriangle with two input slots.
* add to AreaCalc a method called rightTriangle with two input parameters of type double. This corresponds to designing an area calculator with two buttons: one labeled rectangle with two input slots and the other labeled rightTriangle, also with two input slots.
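The second design might look like the sketch below, assuming the usual formula for the area of a right triangle (half the base times the height). The parameter names base and height and the class TwoButtonDemo are assumptions.

```java
class AreaCalc {
    double rectangle(double width, double height) {
        return width * height;
    }

    // Second "button": area of a right triangle is half the base times the height.
    double rightTriangle(double base, double height) {
        return 0.5 * base * height;
    }
}

class TwoButtonDemo {
    public static void main(String[] args) {
        AreaCalc calc = new AreaCalc();
        System.out.println(calc.rightTriangle(4, 5)); // prints 10.0
        System.out.println(calc.rectangle(4, 5));     // prints 20.0
    }
}
```

The two calls take identical inputs and give different results, so it is up to the caller to press the right "button".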

In either design, it is the responsibility of the calculator user to pick the appropriate calculator or press the appropriate button on the calculator to correctly obtain the area of the given geometric shape. Since the two computations require exactly the same number of input parameters of exactly the same type, the calculator user must be careful not to get mixed up. This may not be too much of an inconvenience if there are only two kinds of shape to choose from: rectangle and right triangle. But what if the user has to choose from hundreds of different shapes? Or better yet, an open-ended number of shapes? How can we, as programmers, build a calculator that can handle an infinite number of shapes? The answer lies in abstraction. To motivate how to conceptualize the problem, let us digress and contemplate the behavior of a child!

Modeling a Person

For the first few years of his life, Peter did not have a clue what birthdays were, let alone his own birth date. He was incapable of responding to your inquiry about his birthday. It was his parents who planned his elaborate birthday parties months in advance. We can think of Peter then as a rather "dumb" person with very little intelligence and capability. Now Peter is a college student. There is a piece of memory in his brain that stores his birth date: it's September 12, 1985! Peter is now a rather smart person. He can figure out how many more months there are till his next birthday and e-mail his wish list two months before his birthday. How do we model a "smart" person like Peter? Modeling such a person entails modeling

* a birth date and
* the computation of the number of months till the next birth day given the current month.

A birth date consists of a month, a day and a year. Each of these data can be represented by an integer, which in Java is called a number of type int. As in the computation of the area of a rectangle, the computation of the number of months till the next birthday given the current month can be represented as a method of some class. What differs from the area calculator is that we will lump both the data (i.e. the birth date) and the computation involving the birth date into one class. The grouping of data and computations on the data into one class is called encapsulation. Below is the Java code modeling an intelligent person who knows how to calculate the number of months before his/her next birthday. The line numbers shown are there for easy referencing and are not part of the code.

1 public class Person {
2 /**
3 * All data fields are private in order to prevent code outside of this
4 * class from accessing them.
5 */
6 private int _bDay; // birth day
7 private int _bMonth; // birth month; for example, 3 means March.
8 private int _bYear; // birth year

9 /**
10 * Constructor: a special code used to initialize the fields of the class.
11 * The only way to instantiate a Person object is to call new on the constructor.
12 * For example: new Person(28, 2, 1945) will create a Person object with
13 * birth date February 28, 1945.
14 */
15 public Person(int day, int month, int year) {
16 _bDay = day;
17 _bMonth = month;
18 _bYear = year;
19 }

20 /**
21 * Uses "modulo" arithmetic to compute the number of months till the next
22 * birth day given the current month.
23 * @param currentMonth an int representing the current month.
24 */
25 public int nMonthTillBD(int currentMonth) {
26 return (_bMonth - currentMonth + 12) % 12;
27 }
28 }

We now explain what the above Java code means.

* line 1 defines a class called Person. The opening curly brace at the end of the line and the matching closing brace on line 28 delimit the contents of class Person. The keyword public is called an access specifier and means all Java code in the system can reference this class.
* lines 2-5 are comments. Everything between /* and */ is ignored by the compiler.
* lines 6-8 define three integer variables. These variables are called fields of the class. The keyword private is another access specifier that prevents access by code outside of the class. Only code inside of the class can access them. Each field is followed by a comment delimited by // and the end-of-line. So there are two ways to comment code in Java: start with /* and end with */ or start with // and end with the end-of-line.
* lines 9-14 are comments.
* lines 15-19 constitute what is called a constructor. It is used to initialize the fields of the class to some particular values. The name of the constructor must be spelled exactly like the class name. Here it is public, meaning it can be called by code outside of the class Person via the operator new. For example, new Person(28, 2, 1945) will create an instance of a Person with _bDay = 28, _bMonth = 2 and _bYear = 1945.
* lines 20-24 are comments.
* line 23 is a special format for documenting the parameters of a method. This format is called the javadoc format. We will learn more about javadoc in another module.
* lines 25-27 constitute the definition of a method in class Person.
* line 26 is the formula for computing the number of months before the next birthday using the remainder operator %. x % y gives the remainder of the integer division between the dividend x and the divisor y.
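To see the formula on line 26 at work, here is a small usage sketch built around Peter's birth date from the text. The class PersonDemo is hypothetical, and Person is repeated without line numbers and without the public modifier so that both classes fit in one file.

```java
class Person {
    private int _bDay;   // birth day
    private int _bMonth; // birth month; for example, 3 means March.
    private int _bYear;  // birth year

    public Person(int day, int month, int year) {
        _bDay = day;
        _bMonth = month;
        _bYear = year;
    }

    public int nMonthTillBD(int currentMonth) {
        return (_bMonth - currentMonth + 12) % 12;
    }
}

class PersonDemo {
    public static void main(String[] args) {
        Person peter = new Person(12, 9, 1985); // September 12, 1985

        System.out.println(peter.nMonthTillBD(3));  // (9 - 3 + 12) % 12 = 18 % 12, prints 6
        System.out.println(peter.nMonthTillBD(11)); // (9 - 11 + 12) % 12 = 10 % 12, prints 10
        System.out.println(peter.nMonthTillBD(9));  // (9 - 9 + 12) % 12 = 12 % 12, prints 0

        // Why add 12 first: in Java the remainder takes the sign of the dividend.
        System.out.println((9 - 11) % 12);          // prints -2
    }
}
```

Adding 12 before taking the remainder keeps the dividend non-negative for any month from 1 to 12, so the result always falls between 0 and 11.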