- Programming helps you understand computers. The computer is only a tool. If you learn how to write simple programs, you will gain more knowledge about how a computer works.
- Writing a few simple programs increases your confidence level. Many people find great personal satisfaction in creating a set of instructions that solve a problem.
- Learning programming lets you find out quickly whether you like programming and whether you have the analytical turn of mind programmers need. Even if you decide that programming is not for you, understanding the process certainly will increase your appreciation of what programmers and computers can do.
- Defining the problem
- Planning the solution
- Coding the program
- Testing the program
- Documenting the program
- Defining the Problem
- Suppose that, as a programmer, you are contacted because your services are needed. You meet with users from the client organization to analyze the problem, or you meet with a systems analyst who outlines the project. Specifically, the task of defining the problem consists of identifying what you know (the input, or given data) and what you want to obtain (the output, or result). Eventually, you produce a written agreement that, among other things, specifies the kind of input, processing, and output required. This is not a simple process.
- Planning the Solution
- Two common ways of planning the solution to a problem are to draw a flowchart and to write pseudocode; you may do either or both. Essentially, a flowchart is a pictorial representation of a step-by-step solution to a problem. It consists of arrows representing the direction the program takes and boxes and other symbols representing actions. It is a map of what your program is going to do and how it is going to do it. The American National Standards Institute (ANSI) has developed a standard set of flowchart symbols. Figure 1 shows the symbols and how they might be used in a simple flowchart of a common everyday act: preparing a letter for mailing.
- Pseudocode is an English-like nonstandard language that lets you state your solution with more precision than you can in plain English but with less precision than is required when using a formal programming language. Pseudocode permits you to focus on the program logic without having to be concerned just yet about the precise syntax of a particular programming language. However, pseudocode is not executable on the computer. We will illustrate these later in this chapter, when we focus on language examples.
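- As a minimal sketch of how pseudocode relates to code, the comments below state a small task in pseudocode, and the Python line under each comment expresses the same step. The task (finding the largest number in a list), the function name `largest`, and the sample data are assumptions chosen for illustration; they are not taken from the chapter's figures.

```python
def largest(numbers):
    # Pseudocode: set largest-so-far to the first number
    largest_so_far = numbers[0]
    # Pseudocode: for each remaining number
    for value in numbers[1:]:
        # Pseudocode: if the number is greater than largest-so-far, remember it
        if value > largest_so_far:
            largest_so_far = value
    # Pseudocode: report largest-so-far
    return largest_so_far


print(largest([12, 45, 7, 30]))  # prints 45
```

- The pseudocode lines could be written in any wording the planner likes, whereas the Python lines must follow the language's syntax exactly.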
- Coding the Program
- As the programmer, your next step is to code the program, that is, to express your solution in a programming language. You will translate the logic from the flowchart or pseudocode (or some other tool) to a programming language. As we have already noted, a programming language is a set of rules that provides a way of instructing the computer what operations to perform. There are many programming languages: BASIC, COBOL, Pascal, FORTRAN, and C are some examples. You may find yourself working with one or more of these. We will discuss the different types of languages in detail later in this chapter.
- Although programming languages operate grammatically, somewhat like the English language, they are much more precise. To get your program to work, you have to follow exactly the rules (the syntax) of the language you are using. Of course, using the language correctly is no guarantee that your program will work, any more than speaking grammatically correct English means you know what you are talking about. The point is that correct use of the language is the required first step. Then your coded program must be keyed, probably using a terminal or personal computer, in a form the computer can understand.
- One more note here: Programmers usually use a text editor, which is somewhat like a word processing program, to create a file that contains the program. However, as a beginner, you will probably want to write your program code on paper first.
- Testing the Program
- Some experts insist that a well-designed program can be written correctly the first time. In fact, they assert that there are mathematical ways to prove that a program is correct. However, the imperfections of the world are still with us, so most programmers get used to the idea that their newly written programs probably have a few errors. This is a bit discouraging at first, since programmers tend to be precise, careful, detail-oriented people who take pride in their work. Still, there are many opportunities to introduce mistakes into programs, and you, just as those who have gone before you, will probably find several of them.
- Eventually, after coding the program, you must prepare to test it on the computer. This step involves these phases:
- Desk-checking. This phase, similar to proofreading, is sometimes avoided by the programmer who is looking for a shortcut and is eager to run the program on the computer once it is written. However, with careful desk-checking you may discover several errors and possibly save yourself time in the long run. In desk-checking you simply sit down and mentally trace, or check, the logic of the program to attempt to ensure that it is error-free and workable. Many organizations take this phase a step further with a walkthrough, a process in which a group of programmers (your peers) review your program and offer suggestions in a collegial way.
- Translating. A translator is a program that (1) checks the syntax of your program to make sure the programming language was used correctly, giving you all the syntax-error messages, called diagnostics, and (2) then translates your program into a form the computer can understand. A by-product of the process is that the translator tells you if you have improperly used the programming language in some way. These types of mistakes are called syntax errors. The translator produces descriptive error messages. For instance, if in FORTRAN you mistakenly write N = 2 * (I + J)), which has two closing parentheses instead of one, you will get a message that says, "UNMATCHED PARENTHESES." (Different translators may provide different wording for error messages.) Programs are most commonly translated by a compiler. A compiler translates your entire program at one time. The translation involves your original program, called a source module, which is transformed by a compiler into an object module. Prewritten programs from a system library may be added during the link/load phase, which results in a load module. The load module can then be executed by the computer.
- Debugging. A term used extensively in programming, debugging means detecting, locating, and correcting bugs (mistakes), usually by running the program. These bugs are logic errors, such as telling a computer to repeat an operation but not telling it how to stop repeating. In this phase you run the program using test data that you devise. You must plan the test data carefully to make sure you test every part of the program.
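- The difference between a syntax error (caught while translating) and a logic error (caught while debugging with test data) can be sketched in Python. This is only an illustration under stated assumptions: Python's built-in `compile` stands in for the translator, the helper name `check_syntax` and both program fragments are made up for this sketch, and the wording of the diagnostic differs from the FORTRAN message quoted above.

```python
def check_syntax(source):
    """Return None if the source translates cleanly, else the diagnostic text."""
    try:
        compile(source, "<example>", "exec")  # translation step; nothing is run yet
        return None
    except SyntaxError as err:
        return err.msg


# An extra closing parenthesis, like the FORTRAN example above: a syntax error.
bad_syntax = "n = 2 * (i + j))"
# Correct syntax but faulty logic: an average of three numbers divided by 2.
bad_logic = "def average(a, b, c):\n    return (a + b + c) / 2\n"

print(check_syntax(bad_syntax) is not None)  # True: the translator objects
print(check_syntax(bad_logic))               # None: the translator is satisfied

# Debugging: run the translated program against test data with a known answer.
namespace = {}
exec(compile(bad_logic, "<example>", "exec"), namespace)
result = namespace["average"](3, 6, 9)
print(result == 6)  # False: the test data exposes the logic error (result is 9.0)
```

- The translator accepts the second fragment because it obeys the rules of the language; only running it with test data whose correct answer is known in advance reveals the mistake.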
- Documenting the Program
- Documenting is an ongoing, necessary process, although, like many programmers, you may be eager to pursue more exciting computer-centered activities. Documentation is a detailed written description of the programming cycle and specific facts about the program. Typical program documentation materials include the origin and nature of the problem, a brief narrative description of the program, logic tools such as flowcharts and pseudocode, data-record descriptions, program listings, and testing results. Comments in the program itself are also considered an essential part of documentation. Many programmers document as they code. In a broader sense, program documentation can be part of the documentation for an entire system.
- The wise programmer continues to document the program throughout its design, development, and testing. Documentation is needed to supplement human memory and to help organize program planning. Also, documentation is critical to communicate with others who have an interest in the program, especially other programmers who may be part of a programming team. And, since turnover is high in the computer industry, written documentation is needed so that those who come after you can make any necessary modifications in the program or track down any errors that you missed.
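- As a small sketch of documenting as you code, the hypothetical function below carries its own description of the problem, the input, the output, and a test result. The function name `gross_pay`, the payroll scenario, and the figures are assumptions made up for illustration.

```python
def gross_pay(hours_worked, hourly_rate):
    """Return gross pay for one employee for one pay period.

    Problem: payroll clerks need pay computed from hours and rate.
    Input:   hours_worked (number of hours), hourly_rate (dollars per hour).
    Output:  gross pay in dollars; overtime is NOT handled in this sketch.
    Tested:  gross_pay(40, 12.5) returned 500.0.
    """
    # An inline comment records a decision a later maintainer might question:
    # pay is a plain product here because no overtime rule was specified.
    return hours_worked * hourly_rate


print(gross_pay(40, 12.5))  # prints 500.0
```

- Comments of this kind travel with the program listing, so a programmer who inherits the code sees the documentation without having to look for a separate file.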
- Machine language
- Assembly languages
- High-level languages
- Very high-level languages
- Natural languages
Assembly Languages
Today, assembly languages are considered very low level; that is, they are not as convenient for people to use as more recent languages. At the time they were developed, however, they were considered a great leap forward. To replace the 1s and 0s used in machine language, assembly languages use mnemonic codes, abbreviations that are easy to remember: A for Add, C for Compare, MP for Multiply, STO for storing information in memory, and so on. Although these codes are not English words, they are still, from the standpoint of human convenience, preferable to numbers (0s and 1s) alone. Furthermore, assembly languages permit the use of names (perhaps RATE or TOTAL) for memory locations instead of actual address numbers. As with machine language, each type of computer has its own assembly language.
The programmer who uses an assembly language requires a translator to convert the assembly language program into machine language. A translator is needed because machine language is the only language the computer can actually execute. The translator is an assembler program, also referred to as an assembler. It takes the programs written in assembly language and turns them into machine language. Programmers need not worry about the translating aspect; they need only write programs in assembly language. The translation is taken care of by the assembler.
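To make the assembler's job concrete, here is a toy sketch in Python. It assumes a made-up machine: the mnemonics A, C, MP, and STO come from the text above, but the numeric opcodes, the addresses assigned to RATE and TOTAL, and the three-line sample program are all invented for illustration and do not correspond to any real computer.

```python
# Hypothetical opcode table: the mnemonics come from the text above,
# but the numeric codes are invented for this sketch.
OPCODES = {"A": 0x1A, "C": 0x19, "MP": 0x1C, "STO": 0x50}


def assemble(lines, symbols):
    """Translate 'MNEMONIC NAME' lines into (opcode, address) number pairs."""
    machine_code = []
    for line in lines:
        mnemonic, name = line.split()
        # Replace the mnemonic with its numeric code and the name with its address.
        machine_code.append((OPCODES[mnemonic], symbols[name]))
    return machine_code


# Names such as RATE and TOTAL stand in for memory addresses (made up here).
symbols = {"RATE": 100, "TOTAL": 104}
program = ["MP RATE", "A TOTAL", "STO TOTAL"]
print(assemble(program, symbols))
# prints [(28, 100), (26, 104), (80, 104)]
```

A real assembler does considerably more (for example, it assigns the addresses itself), but the core idea is the same table lookup: mnemonics and names go in, numbers come out.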
Although assembly languages represent a step forward, they still have many disadvantages. A key disadvantage is that assembly language is detailed in the extreme, making assembly programming repetitive, tedious, and error prone. This drawback is apparent in the program in Figure 2. Assembly language may be easier to read than machine language, but it is still tedious.
High-Level Languages
The first widespread use of high-level languages in the early 1960s transformed programming into something quite different from what it had been. Programs were written in an English-like manner, thus making them more convenient to use. As a result, a programmer could accomplish more with less effort, and programs could now direct much more complex tasks.
These so-called third-generation languages spurred the great increase in data processing that characterized the 1960s and 1970s. During that time the number of mainframes in use increased from hundreds to tens of thousands. The impact of third-generation languages on our society has been enormous.
Of course, a translator is needed to translate the symbolic statements of a high-level language into computer-executable machine language; this translator is usually a compiler. There are many compilers for each language, one for each type of computer. Since the machine language generated by one computer's COBOL compiler, for instance, is not the machine language of some other computer, it is necessary to have a COBOL compiler for each type of computer on which COBOL programs are to be run. Keep in mind, however, that even though a given program would be compiled to different machine language versions on different machines, the source program itself (the COBOL version) can be essentially identical on each machine.
Some languages are created to serve a specific purpose, such as controlling industrial robots or creating graphics. Many languages, however, are extraordinarily flexible and are considered to be general-purpose. In the past the majority of programming applications were written in BASIC, FORTRAN, or COBOL, all general-purpose languages. In addition to these three, another popular high-level language is C, which we will discuss later.
Very High-Level Languages
Very high-level languages are often known by their generation number; that is, they are called fourth-generation languages or, more simply, 4GLs.
Definition
Will the real fourth-generation languages please stand up? There is no consensus about what constitutes a fourth-generation language. The 4GLs are essentially shorthand programming languages. An operation that requires hundreds of lines in a third-generation language such as COBOL typically requires only five to ten lines in a 4GL. However, beyond the basic criterion of conciseness, 4GLs are difficult to describe.
Characteristics
Fourth-generation languages share some characteristics. The first is that they make a true break with the prior generation: they are basically nonprocedural. A procedural language tells the computer how a task is done: Add this, compare that, do this if something is true, and so forth, a very specific step-by-step process. The first three generations of languages are all procedural. In a nonprocedural language, the concept changes. Here, users define only what they want the computer to do; the user does not provide the details of just how it is to be done. Obviously, it is a lot easier and faster just to say what you want rather than how to get it. This leads us to the issue of productivity, a key characteristic of fourth-generation languages.
Productivity
Folklore has it that fourth-generation languages can improve productivity by a factor of 5 to 50. The folklore is true. Most experts say the average improvement factor is about 10; that is, you can be ten times more productive in a fourth-generation language than in a third-generation language. Consider this request: Produce a report showing the total units sold for each product, by customer, in each month and year, and with a subtotal for each customer. In addition, each new customer must start on a new page. A 4GL request looks something like this:
TABLE FILE SALES
SUM UNITS BY MONTH BY CUSTOMER BY PRODUCT
ON CUSTOMER SUBTOTAL PAGE BREAK
END
Even though some training is required to do even this much, you can see that it is pretty simple. The third-generation language COBOL, however, typically requires over 500 statements to fulfill the same request. If we define productivity as producing equivalent results in less time, then fourth-generation languages clearly increase productivity.
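For contrast, here is a rough sketch in Python of the step-by-step work that a request like the one above leaves unstated. It assumes a handful of made-up sales records and hypothetical field names, and it omits the page break; it is meant only to show the procedural "how" behind the nonprocedural "what", not to reproduce the report layout a 4GL would generate.

```python
from collections import defaultdict

# Hypothetical sales records: (month, customer, product, units).
sales = [
    ("JAN", "ACME", "BOLTS", 10),
    ("JAN", "ACME", "BOLTS", 5),
    ("JAN", "ACME", "NUTS", 7),
    ("FEB", "ZENITH", "BOLTS", 4),
]

# Step 1: accumulate units for each (month, customer, product) group.
totals = defaultdict(int)
for month, customer, product, units in sales:
    totals[(month, customer, product)] += units

# Step 2: accumulate a subtotal for each customer.
subtotals = defaultdict(int)
for (month, customer, product), units in totals.items():
    subtotals[customer] += units

# Step 3: print the detail lines in a fixed order, then the subtotals.
for key in sorted(totals):
    print(*key, totals[key])
for customer in sorted(subtotals):
    print(customer, "SUBTOTAL", subtotals[customer])
```

Every loop and accumulator here is a detail the programmer must get right by hand; in the nonprocedural request, those details are left to the language.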
Downside
Fourth-generation languages are not all peaches and cream and productivity. The 4GLs are still evolving, and that which is still evolving cannot be fully defined or standardized. What is more, since many 4GLs are easy to use, they attract a large number of new users, who may then overcrowd the computer system. One of the main criticisms is that the new languages lack the necessary control and flexibility when it comes to planning how you want the output to look. A common perception of 4GLs is that they do not make efficient use of machine resources; however, the benefits of getting a program finished more quickly can far outweigh the extra costs of running it.
Benefits
Fourth-generation languages are beneficial because
- They are results-oriented; they emphasize what instead of how.
- They improve productivity because programs are easy to write and change.
- They can be used with a minimum of training by both programmers and nonprogrammers.
- They shield users from needing an awareness of hardware and program structure.
It was not long ago that few people believed that 4GLs would ever be able to replace third-generation languages. The 4GLs are now being used, but in a very limited way.
Query Languages
A variation on fourth-generation languages is the query language, which can be used to retrieve information from databases. Data is usually added to databases according to a plan, and planned reports may also be produced. But what about a user who needs an unscheduled report or a report that differs somehow from the standard reports? A user can learn a query language fairly easily and then be able to input a request and receive the resulting report right on his or her own terminal or personal computer. A standardized query language, which can be used with several different commercial database programs, is Structured Query Language, popularly known as SQL. Other popular query languages are Query-by-Example, known as QBE, and Intellect.
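As a hedged illustration of an unscheduled request expressed in SQL, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The table name `sales`, its columns, and the rows are all made up for this example; a real installation would query its own existing database.

```python
import sqlite3

# In-memory database with a hypothetical sales table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, product TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("ACME", "BOLTS", 10), ("ACME", "NUTS", 7), ("ZENITH", "BOLTS", 4)],
)

# The ad hoc request: total units per customer, largest first.
# The query says what is wanted; the database decides how to compute it.
rows = conn.execute(
    "SELECT customer, SUM(units) FROM sales "
    "GROUP BY customer ORDER BY SUM(units) DESC"
).fetchall()
print(rows)  # prints [('ACME', 17), ('ZENITH', 4)]
conn.close()
```

The user states only the desired result (totals by customer, in descending order); no loops or accumulators appear in the request itself.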
Natural Languages
The word "natural" has become almost as popular in computing circles as it has in the supermarket. Fifth-generation languages are, as you may guess, even more ill-defined than fourth-generation languages. They are most often called natural languages because of their resemblance to the "natural" spoken English language. And, to the manager new to computers at whom these languages are now aimed, natural means human-like. Instead of being forced to key correct commands and data names in correct order, a manager tells the computer what to do by keying in his or her own words.
REPORT THE BASE SALARY, COMMISSIONS AND YEARS OF
SERVICE BROKEN DOWN BY STATE AND CITY FOR SALESCLERKS
IN NEW JERSEY AND MASSACHUSETTS.
- In a work environment, your manager may decree that everyone on your project will use a certain language.
- You may use a certain language, particularly in a business environment, based on the need to interface with other programs; if two programs are to work together, it is easiest if they are written in the same language.
- You may choose a language based on its suitability for the task. For example, a business program that handles large files may be best written in the business language COBOL.
- If a program is to be run on different computers, it must be written in a language that is portable (suitable on each type of computer) so that the program need be written only once.
- You may be limited by the availability of the language. Not all languages are available in all installations or on all computers.
- The choice of language may be limited by the expertise of the programmer; that is, the program may have to be written in a language the available programmer knows.
- Perhaps the simplest reason, one that applies to many amateur programmers, is that they know the language called BASIC because it came with (or was inexpensively purchased with) their personal computers.
The following sections on individual languages will give you an overview of the
third-generation languages in common use today: FORTRAN (a scientific language),
COBOL (a business language), BASIC (a simple language used for education and
business), Pascal (education), Ada (military), and C (general purpose).
This chapter will present programs written in some of these languages.
You will also see output produced by each program. Each program is designed to
find the average of three numbers; the resulting average is shown in the sample
output matching each program. Since all programs perform the same task, you will
see some of the differences and similarities among the languages. We do not
expect you to understand these programs; they are here merely to let you glimpse
each language. Figure 4 presents the flowchart and pseudocode for the task of
averaging numbers. As we discuss each language, we will provide a program for
averaging numbers that follows the logic shown in this figure.
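For readers who want a preview of the task itself, the following is a minimal sketch in Python, which is not one of the languages covered in this chapter. It assumes the three numbers are supplied directly in the program rather than read from a user or a file, and the function name `average` and the sample values are made up for illustration.

```python
def average(first, second, third):
    # Add the three numbers, then divide the sum by how many there are.
    total = first + second + third
    return total / 3


numbers = (7, 9, 14)  # hypothetical input values
print("The average is", average(*numbers))  # prints: The average is 10.0
```

Whatever the language, the same two steps appear: form the sum, then divide by the count.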
FORTRAN: The First High-Level Language
Developed by IBM and introduced in 1954, FORTRAN (for FORmula TRANslator) was the first high-level language. FORTRAN is a scientifically oriented language; in the early days use of the computer was primarily associated with engineering, mathematical, and scientific research tasks.
FORTRAN is noted for its brevity, and this characteristic is part of the reason why it remains popular. This language is very good at serving its primary purpose, which is execution of complex formulas such as those used in economic analysis and engineering. Although in the past it was considered limited in regard to file processing or data processing, its capabilities have been greatly improved.
Not all programs are organized in the same way. Organization varies according to the language used. In many languages (such as COBOL), programs are divided into a series of parts. FORTRAN programs are not composed of different parts (although it is possible to link FORTRAN programs together); a FORTRAN program consists of statements one after the other. Different types of data are identified as the data is used. Descriptions for data records appear in format statements that accompany the READ and WRITE statements. Figure 5 shows a FORTRAN program and a sample output from the program.
COBOL: The Language of Business
In the 1950s FORTRAN had been developed, but there was still no accepted
high-level programming language appropriate for business. The U.S. Department of
Defense in particular was interested in creating such a standardized language,
and so it called together representatives from government and various
industries, including the computer industry. These representatives formed
CODASYL, the COnference on DAta SYstems Languages. In 1959 CODASYL introduced
COBOL, for COmmon Business-Oriented Language.
The U.S. government offered
encouragement by insisting that anyone attempting to win government contracts
for computer-related projects had to use COBOL. The American National Standards
Institute first standardized COBOL in 1968 and, in 1974, issued standards for
another version known as ANSI-COBOL. After more than seven controversial years
of industry debate, the standard known as COBOL 85 was approved, making COBOL a
more usable modern-day software tool. The principal benefit of standardization
is that COBOL is relatively machine independent; that is, a program written for
one type of computer can be run with only slight modifications on another type
for which a COBOL compiler has been developed.
COBOL is very good for
processing large files and performing relatively simple business calculations,
such as payroll or interest. A noteworthy feature of COBOL is that it is
English-like, far more so than FORTRAN or BASIC. The variable names are set up in
such a way that, even if you know nothing about programming, you can still
understand what the program does. For example:
IF SALES-AMOUNT IS GREATER THAN SALES-QUOTA
COMPUTE COMMISSION = MAX-RATE * SALES-AMOUNT
ELSE
COMPUTE COMMISSION = MIN-RATE * SALES-AMOUNT.
Once you understand programming principles, it is not too difficult to
add COBOL to your repertoire. COBOL can be used for just about any task related
to business programming; indeed, it is especially suited to processing
alphanumeric data such as street addresses, purchased items, and dollar
amounts, the data of business. However, the feature that makes COBOL so
useful (its English-like appearance and easy readability) is also a weakness
because a COBOL program can be incredibly verbose. A programmer seldom knocks
out a quick COBOL program. In fact, there is hardly such a thing as a quick
COBOL program; there are just too many program lines to write, even to
accomplish a simple task. For speed and simplicity, BASIC, FORTRAN, and Pascal
are probably better bets.
As you can see in Figure 6, a COBOL program is
divided into four parts called divisions. The identification division identifies
the program by name and often contains helpful comments as well. The environment
division describes the computer on which the program will be compiled and
executed. It also relates each file of the program to the specific physical
device, such as the tape drive or printer, that will read or write the file. The
data division contains details about the data processed by the program, such as
type of characters (whether numeric or alphanumeric), number of characters, and
placement of decimal points. The procedure division contains the statements that
give the computer specific instructions to carry out the logic of the program.
It has been fashionable for some time to criticize COBOL: It is
old-fashioned, cumbersome, and inelegant. In fact, some companies, devoted to
fast, nimble program development, are converting to the more trendy language C.
But COBOL, with more than 30 years of staying power, is still famous for its
clear code, which is easy to read and debug.
BASIC: For Beginners and Others
BASIC (Beginners' All-purpose Symbolic Instruction Code) is a common language that is easy to learn. Developed at Dartmouth College, BASIC was introduced by John Kemeny and Thomas Kurtz in 1964 and was originally intended for use by students in an academic environment. In the late 1960s it became widely used in interactive time-sharing environments in universities and colleges. The use of BASIC has extended to business and personal computer systems.
The primary feature of BASIC is one that may be of interest to many readers of this book: BASIC is easy to learn, even for a person who has never programmed before. Thus, the language is used often to train students in the classroom. BASIC is also used by non-programming people, such as engineers, who find it useful in problem solving. For many years, BASIC was looked down on by "real programmers," who complained that it had too many limitations and was not suitable for complex tasks. Newer versions, such as Microsoft's QuickBASIC, include substantial improvements. An example of a BASIC program and its output are shown in Figure 7.
Pascal: The Language of Simplicity
Named for Blaise Pascal, the seventeenth-century French mathematician, Pascal was developed as a teaching language by a Swiss computer scientist, Niklaus Wirth, and first became available in 1971. Since that time it has become quite popular, first in Europe and now in the United States, particularly in universities and colleges offering computer science programs.
The foremost feature of Pascal is that it is simpler than other languages: it has fewer features and is less wordy than most. In addition to the popularity of Pascal in college computer science departments, the language has also made large inroads in the personal computer market as a simple yet sophisticated alternative to BASIC. Over the years new versions have improved on the original capabilities of Pascal. Today, Borland's Turbo Pascal leads the Pascal world because its designers eliminated most of the drawbacks of the original Pascal. Turbo Pascal is used by the business community and is often the choice of nonprofessional programmers who need to write their own programs.
Ada: Named for the Countess
Is any software worth over $25 billion? Not any more, according to Defense Department experts. In 1974 the U.S. Department of Defense had spent that amount on all kinds of software for a hodgepodge of languages for its needs. The answer to this problem turned out to be a new language called Ada, named for Countess Ada Lovelace, "the first programmer" (see Appendix B). Sponsored by the Pentagon, Ada was originally intended to be a standard language for weapons systems, but it has also been used successfully for commercial applications. Introduced in 1980, Ada has the support not only of the defense establishment but also of such industry heavyweights as IBM and Intel, and Ada is even available for some personal computers. Although some experts have said Ada is too complex, others say that it is easy to learn and that it will increase productivity. Indeed, some experts believe that it is by far a superior commercial language to such standbys as COBOL and FORTRAN.
Widespread use of Ada is considered unlikely by many experts. Although there are many reasons for this (the military services, for instance, have different levels of enthusiasm for it), probably its size (which may hinder its use on personal computers) and complexity are the greatest barriers. Although the Department of Defense is a market in itself, Ada has not caught on to the extent that Pascal and C have, especially in the business community.
C, C++, Java, and JavaScript
A language invented by Dennis Ritchie at Bell Labs in 1972, C produces code that approaches assembly language in efficiency while still offering high-level language features. C was originally designed to write systems software but is now considered a general-purpose language. C contains some of the best features from other languages, including Pascal. C compilers are simple and compact. A key attraction is that it is independent of the architecture of any particular machine, a fact that contributes to the portability of C programs. That is, a C program can be run on more than one type of computer after it has been compiled for that machine.
Although C is simple and elegant, it is not simple to learn. It was developed for gifted programmers, and the learning curve may be steep. Straightforward tasks may be solved easily in C, but complex problems require mastery of the language.
An interesting side note is that the availability of C on personal computers has greatly enhanced the value of personal computers for budding software entrepreneurs. A cottage software industry can use the same basic tool (the language C) used by established software companies such as Microsoft and Borland. Today C has in many settings been replaced by its enhanced cousin, C++. C++ in turn is being challenged by web-aware languages like Java and JavaScript, which look and act a lot like C++ but add features to support working with networked computers, among other things.