
Wednesday, March 17, 2010

What is C programming language?

C is a general-purpose computer programming language developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system.

Although C was designed for implementing system software, it is also widely used for developing portable application software.

C is one of the most popular programming languages and there are very few computer architectures for which a C compiler does not exist. C has greatly influenced many other popular programming languages, most notably C++, which originally began as an extension to C.


Design

C is an imperative systems implementation language. It was designed to be compiled using a relatively straightforward compiler, to provide low-level access to memory, to provide language constructs that map efficiently to machine instructions, and to require minimal run-time support. C was therefore useful for many applications that had formerly been coded in assembly language.

Despite its low-level capabilities, the language was designed to encourage machine-independent programming. A standards-compliant and portably written C program can be compiled for a very wide variety of computer platforms and operating systems with little or no change to its source code. The language has become available on a very wide range of platforms, from embedded microcontrollers to supercomputers.

Minimalism
C's design is tied to its intended use as a portable systems implementation language. It provides simple, direct access to any addressable object, and its source-code expressions can be translated in a straightforward manner to primitive machine operations in the executable code. Some early C compilers were comfortably implemented on PDP-11 processors having only 16 address bits. C compilers for several common 8-bit platforms have been implemented as well.


Characteristics
Like most imperative languages in the ALGOL tradition, C has facilities for structured programming and allows lexical variable scope and recursion, while a static type system prevents many unintended operations. In C, all executable code is contained within functions. Function parameters are always passed by value. Pass-by-reference is simulated in C by explicitly passing pointer values. Heterogeneous aggregate data types (struct) allow related data elements to be combined and manipulated as a unit. C program source text is free-format, using the semicolon as a statement terminator.

C also exhibits the following more specific characteristics:


variables may be hidden in nested blocks
partially weak typing; for instance, characters can be used as integers
low-level access to computer memory by converting machine addresses to typed pointers
function and data pointers supporting ad hoc run-time polymorphism
array indexing as a secondary notion, defined in terms of pointer arithmetic
a preprocessor for macro definition, source code file inclusion, and conditional compilation
complex functionality such as I/O, string manipulation, and mathematical functions consistently delegated to library routines
a relatively small set of reserved keywords
a lexical structure that resembles B more than ALGOL, for example:
{ ... } rather than either of ALGOL 60's begin ... end or ALGOL 68's ( ... )
= is used for assignment (copying), like Fortran, rather than ALGOL's :=
== is used to test for equality (rather than .EQ. in Fortran, or = in BASIC and ALGOL)
Logical "and" and "or" are represented with && and || in place of ALGOL's ∧ and ∨; the doubled-up operators never evaluate the right operand if the result can be determined from the left alone (this is called short-circuit evaluation), and they are semantically distinct from the bit-wise operators & and |
However, the C dialects of Unix Versions 6 and 7 did provide ALGOL-style operators, but for determining the infimum and supremum respectively.
a large number of compound operators, such as +=, ++, etc. (Equivalent to ALGOL 68's +:= and +:=1 operators)

Absent features

The relatively low-level nature of the language affords the programmer close control over what the computer does, while allowing special tailoring and aggressive optimization for a particular platform. This allows the code to run efficiently on very limited hardware, such as embedded systems.

C does not have some features that are available in some other programming languages:

No nested function definitions
No direct assignment of arrays or strings (copying can be done via standard functions; assignment of objects having struct or union type is supported)
No automatic garbage collection
No requirement for bounds checking of arrays
No operations on whole arrays
No syntax for ranges, such as the A..B notation used in several languages
Prior to C99, no separate Boolean type (zero/nonzero is used instead)
No formal closures or functions as parameters (only function and variable pointers)
No generators or coroutines; intra-thread control flow consists of nested function calls, except for the use of the longjmp or setcontext library functions
No exception handling; standard library functions signify error conditions with the global errno variable and/or special return values, and the library functions setjmp and longjmp provide non-local gotos
Only rudimentary support for modular programming
No compile-time polymorphism in the form of function or operator overloading
Very limited support for object-oriented programming with regard to polymorphism and inheritance
Limited support for encapsulation
No native support for multithreading and networking
No standard libraries for computer graphics and several other application programming needs
A number of these features are available as extensions in some compilers, or are provided in some operating environments (e.g., POSIX), or are supplied by third-party libraries, or can be simulated by adopting certain coding disciplines.

Undefined behavior
Many operations in C that have undefined behavior are not required to be diagnosed at compile time. In C, "undefined behavior" means that the exact behavior which arises is not specified by the standard, and exactly what will happen does not have to be documented by the implementation. A famous, although misleading, expression from the newsgroups comp.std.c and comp.lang.c is that the program could cause "demons to fly out of your nose".

In practice, an instance of undefined behavior often manifests as a bug that is hard to track down and that may corrupt the contents of memory. Sometimes a particular compiler generates reasonable and well-behaved actions that are completely different from those that would be obtained using a different C compiler.

Some behavior has been left undefined to allow compilers for a wide variety of instruction set architectures to generate more efficient executable code for well-defined behavior, which was deemed important for C's primary role as a systems implementation language. Thus C makes it the programmer's responsibility to avoid undefined behavior, possibly using tools to find the parts of a program whose behavior is undefined. Examples of undefined behavior are:

accessing outside the bounds of an array
overflowing a signed integer
reaching the end of a non-void function without a return statement, when the return value is used
reading the value of a variable before initializing it
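Since the compiler is not required to diagnose these cases, the usual discipline is to guard against them before they can occur. A minimal sketch of a pre-checked signed addition (checked_add is an illustrative name, not a standard function):

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores the sum only if a + b would not overflow;
   the overflow itself is never performed, so behavior stays defined. */
bool checked_add(int a, int b, int *result) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;   /* would overflow: report instead */
    }
    *result = a + b;
    return true;
}
```

For example, checked_add(INT_MAX, 1, &r) returns false rather than invoking undefined behavior.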

These operations are all programming errors that could occur using many programming languages; C draws criticism because its standard explicitly identifies numerous cases of undefined behavior, including some where the behavior could have been made well defined, and does not specify any run-time error handling mechanism.

Invoking fflush() on a stream opened for input is an example of a different kind of undefined behavior, not necessarily a programming error but a case for which some conforming implementations may provide well-defined, useful semantics as an allowed extension. Use of such nonstandard extensions generally limits software portability.

Languages related to C

C has directly or indirectly influenced many later languages such as Java, Perl, PHP, JavaScript, LPC, C# and Unix's C Shell. The most pervasive influence has been syntactical: all of the languages mentioned combine the statement and expression syntax of C with type systems, data models and/or large-scale program structures that differ from those of C, sometimes radically.

When object-oriented languages became popular, C++ and Objective-C were two different extensions of C that provided object-oriented capabilities. Both languages were originally implemented as source-to-source compilers: source code was translated into C, and then compiled with a C compiler.

Bjarne Stroustrup devised the C++ programming language as one approach to providing object-oriented functionality with C-like syntax. C++ adds greater typing strength, scoping and other tools useful in object-oriented programming, and permits generic programming via templates. Nearly a superset of C, C++ now supports most of C, with a few exceptions.

Unlike C++, which maintains nearly complete backwards compatibility with C, the D language makes a clean break with C while maintaining the same general syntax. It abandons a number of features of C which Walter Bright considered undesirable, including the C preprocessor and trigraphs. Some, but not all, of D's extensions to C overlap with those of C++.

Objective-C was originally a very "thin" layer on top of, and remains a strict superset of, C that permits object-oriented programming using a hybrid dynamic/static typing paradigm. Objective-C derives its syntax from both C and Smalltalk: syntax that involves preprocessing, expressions, function declarations and function calls is inherited from C, while the syntax for object-oriented features was originally taken from Smalltalk.

Limbo is a language developed by the same team at Bell Labs that was responsible for C and Unix, and while retaining some of the syntax and the general style, introduced garbage collection, CSP based concurrency and other major innovations.

Python has a different sort of C heritage. While the syntax and semantics of Python are radically different from C, the most widely used Python implementation, CPython, is an open source C program. This allows C users to extend Python with C, or embed Python into C programs. This close relationship is one of the key factors leading to Python's success as a general-use dynamic language.

Perl is another example of a popular programming language rooted in C. However, unlike Python, Perl's syntax does closely follow C syntax. The standard Perl implementation is written in C and supports extensions written in C.



Programming paradigm

A programming paradigm is a fundamental style of computer programming. Paradigms differ in the concepts and abstractions used to represent the elements of a program and the steps that compose a computation.

A related notion is the programming model: an abstraction of a computer system, for example the "von Neumann model" used in traditional sequential computers. For parallel computing, there are many possible models, typically reflecting different ways processors can be interconnected. The most common are based on shared memory, on distributed memory with message passing, or on a hybrid of the two.

A programming language can support multiple paradigms. For example, programs written in C++ or Object Pascal can be purely procedural, purely object-oriented, or contain elements of both paradigms. Software designers and programmers decide how to use those paradigm elements.

In object-oriented programming, programmers can think of a program as a collection of interacting objects, while in functional programming a program can be thought of as a sequence of stateless function evaluations. When programming computers or systems with many processors, process-oriented programming allows programmers to think about applications as sets of concurrent processes acting upon logically shared data structures.

Just as different groups in software engineering advocate different methodologies, different programming languages advocate different programming paradigms. Some languages are designed to support one particular paradigm (Smalltalk supports object-oriented programming, Haskell supports functional programming), while others support multiple paradigms (such as Object Pascal, C++, C#, Visual Basic, Common Lisp, Scheme, Perl, Python, Ruby, Oz and F#).

Many programming paradigms are as well known for what techniques they forbid as for what they enable. For instance, pure functional programming disallows the use of side-effects; structured programming disallows the use of the goto statement. Partly for this reason, new paradigms are often regarded as doctrinaire or overly rigid by those accustomed to earlier styles. Avoiding certain techniques can make it easier to prove theorems about a program's correctness—or simply to understand its behavior.

Multi-paradigm programming language


A multi-paradigm programming language is a programming language that supports more than one programming paradigm. As Leda designer Timothy Budd puts it: "The idea of a multiparadigm language is to provide a framework in which programmers can work in a variety of styles, freely intermixing constructs from different paradigms." The design goal of such languages is to allow programmers to use the best tool for a job, admitting that no one paradigm solves all problems in the easiest or most efficient way.

An example is Oz, which has subsets that are a logic language (Oz descends from logic programming), a functional language, an object-oriented language, a dataflow concurrent language, and more. Oz was designed over a ten-year period to combine in a harmonious way concepts that are traditionally associated with different programming paradigms.


Initially, computers were hard-wired and then later programmed using binary code that represented control sequences fed to the computer CPU. This was difficult and error-prone. Programs written in binary are said to be written in machine code, which is a very low-level programming paradigm.

To make programming easier, assembly languages were developed. These replaced machine-code functions with mnemonics and memory addresses with symbolic labels. Assembly language programming is considered a low-level paradigm, although it is a 'second generation' paradigm. Even assembly languages of the 1960s supported library COPY and quite sophisticated conditional macro generation and pre-processing capabilities. They also supported modular programming features such as CALL (subroutines), external variables and common sections (globals), enabling significant code re-use and isolation from hardware specifics via logical file operations such as READ/WRITE/GET/PUT. Assembly was, and still is, used for time-critical systems and frequently in embedded systems.

The next advance was the development of procedural languages. These third-generation languages use vocabulary related to the problem being solved. For example,

COBOL (Common Business Oriented Language) - uses terms like file, move and copy.
FORTRAN (FORmula TRANslation) and
ALGOL (ALGOrithmic Language) - both using mathematical language terminology,
were developed mainly for commercial or scientific and engineering problems, although one of the ideas behind the development of ALGOL was that it was an appropriate language to define algorithms.

PL/I (Programming Language One) - a hybrid commercial/scientific general-purpose language supporting pointers
BASIC (Beginner's All-purpose Symbolic Instruction Code) - was developed to enable more people to write programs.
All these languages follow the procedural paradigm. That is, they describe, step by step, exactly the procedure that should, according to the particular programmer at least, be followed to solve a specific problem. The efficacy and efficiency of any such solution are both therefore entirely subjective and highly dependent on that programmer's experience, inventiveness and ability.

Later, object-oriented languages (like Simula, Smalltalk, Eiffel and Java) were created. In these languages, data and the methods that manipulate it are kept as a single unit called an object. The only way a user can access the data is via the object's 'methods' (subroutines). Because of this, the internal workings of an object may be changed without affecting any code that uses the object. There is still some controversy among notable programmers, such as Alexander Stepanov, Richard Stallman and others, concerning the efficacy of the OOP paradigm versus the procedural paradigm. The need for every object to have associated methods leads some skeptics to associate OOP with software bloat. Polymorphism was developed as one attempt to resolve this dilemma.

Since object-oriented programming is considered a paradigm, not a language, it is possible to create even an object-oriented assembler language. High Level Assembly (HLA) is an example of this that fully supports advanced data types and object-oriented assembly language programming - despite its early origins. Thus, differing programming paradigms can be thought of as more like 'motivational memes' of their advocates - rather than necessarily representing progress from one level to the next. Precise comparisons of the efficacy of competing paradigms are frequently made more difficult because of new and differing terminology applied to similar (but not identical) entities and processes together with numerous implementation distinctions across languages.

Independent of the imperative branch based on procedural languages, declarative programming paradigms were developed. In these languages the computer is told what the problem is, not how to solve it: the program is structured as a collection of properties to find in the expected result, not as a procedure to follow. Given a database or a set of rules, the computer tries to find a solution matching all the desired properties. Archetypal examples of declarative languages are the fourth-generation language SQL, the family of functional languages, and logic programming.

Functional programming is a subset of declarative programming. Programs written using this paradigm use functions, blocks of code intended to behave like mathematical functions. Functional languages discourage changes in the value of variables through assignment, making a great deal of use of recursion instead.
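Even in C, this recursive, assignment-free style can be imitated; the sketch below sums an array without ever mutating a variable:

```c
/* Recursively sums the first n elements of a. No loop counter is
   ever reassigned, mirroring the functional style. */
int sum(const int *a, int n) {
    return n == 0 ? 0 : a[0] + sum(a + 1, n - 1);
}
```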

The logic programming paradigm views computation as automated reasoning over a corpus of knowledge. Facts about the problem domain are expressed as logic formulas, and programs are executed by applying inference rules over them until an answer to the problem is found, or the collection of formulas is proved inconsistent.



Programming Fundamentals

What is programming?

When we want a computer to perform a specific task, such as generating a marks sheet or a salary slip, we have to create a sequence of instructions in a logical order that a computer can understand and interpret. This sequence of instructions is called a program. The process of writing programs is called programming.

The task of programming involves a lot of effort and careful planning. Without this, the computer will produce erroneous results. The following steps should go into the planning of a program:

1. Defining and analyzing the problem

2. Developing the solution logically using an algorithm

Defining and analyzing the problem

Before writing a program, we have to define exactly:

1. what data we need to provide (the input), and

2. what information we want the program to produce (the output).

Once we know these, we can figure out how to develop the solution.

Deciding on input

Suppose we want to write a program to work out the total and average of a student's marks in five subjects. We would need to supply the marks in the five subjects as input.

Deciding on output

Next, we have to think about the output: the elements that should be displayed and those that should not. In the marks example, since the task is to prepare a marks sheet, the marks in all five subjects, their total and their average should be displayed on the screen.

Developing a solution logically

Once we have defined and analyzed the problem, and decided on the input and output, we can go on to develop the solution.

The most important aspect of developing the solution is developing the logic to solve the problem. This requires creating a set of step-by-step instructions and/or rules called an algorithm. Each step performs a particular task. We can write these steps in plain English. The algorithm for the example on finding total marks and average would look like this:

1. Note down the student’s marks in different subjects.

2. Find the total marks scored by the student.

3. Compute the average marks.

4. Assign grade.

5. Display average percentage of marks and grade.

6. End.

For any computer program to work well, it has to be written properly. Formulating an effective algorithm will make writing an effective program easier. For an algorithm to be effective, it has to have the following characteristics:

1. Finiteness: An algorithm should terminate after a fixed number of steps.

2. Definiteness: Each step of the algorithm should be defined precisely. There should be no ambiguity.

3. Effectiveness: All the operations in the algorithm should be basic enough to be performed exactly and in a finite amount of time.

4. Input: An algorithm should have certain inputs.

5. Output: An algorithm should yield one or more outputs that are the result of operations performed on the given input.

The mathematician and computer scientist Donald E. Knuth first expressed these characteristics.

Programming: The next step after developing an algorithm

Once we develop the algorithm, we need to convert it into a computer program using a programming language (a language used to develop computer programs). A programming language is entirely different from the language we speak or write. However, it also has a fixed set of words and rules (syntax or grammar) that are used to write instructions for a computer to follow.

Programming languages can be divided into three types. They are:

1. Machine language

This is the basic language understood by a computer. This language is made up of 0’s and 1’s. A combination of these two digits represents characters, numbers, and/or instructions. Machine language is also referred to as binary language.

2. Assembly language

This language uses codes such as ADD, MOV, and SUB to represent instructions. These codes are called mnemonics. Though these codes have to be memorized, assembly language is much easier to use than machine language.

3. High-level languages

High-level languages such as BASIC, FORTRAN, C, C++, and Java are much easier to use than machine language or assembly language because their vocabulary resembles English.

A quick comparison of programming languages

|  | Machine language | Assembly language | High-level languages |
| --- | --- | --- | --- |
| Time to execute | As the computer's native language it requires no translation, ensuring better machine efficiency: programs run faster. | An assembler is required to convert the program into machine language, so it takes longer to execute than a machine language program. | A compiler or interpreter is required to convert the program into machine language, so it takes still more time to execute. |
| Time to develop | Needs a lot of skill, as instructions are very lengthy and complex; programming takes the most time. | Simpler to use than machine language, though the instruction codes must be memorized; programs take less time to develop than in machine language. | Easiest to use; programs take the least time to develop, ensuring better programmer efficiency. |

Developing a computer program

Follow the steps given below to become a successful programmer:

  1. Define the problem: Examine the problem until you understand it thoroughly.
  2. Outline the solution: Analyze the problem.
  3. Expand the outline of the solution into an algorithm: Write a step-by-step procedure that leads to the solution.
  4. Test the algorithm for correctness: Provide test data and try to work out the problem as the computer would. This is a critical step, but one that programmers often forget.
  5. Convert the algorithm into a program: Translate the instructions in the algorithm into a computer program using any programming language.
  6. Document the program clearly: Describe each line of instruction, or at least the important portions of the program. This will make the program easy to follow when accessed later for corrections or changes.
  7. Run the program: Instruct the computer to execute the program. The process of running the program differs from language to language.
  8. Debug the program: Make sure that the program runs correctly without any errors, or "bugs" as they are called in computer terminology. Finding the errors and fixing them is called debugging. Don't get depressed when bugs are found; think of it as a way to learn.