CSCI212 Assignment 2

This assignment consists of two tasks: task 1 and task 2.

Task 1

Task 1 is to develop your own shell (called MyUnix in this program) that behaves like a Unix or Unix-like terminal.

The goal of task 1 is to design and implement a simple, interactive shell program that prompts the user for a command, parses the command, and executes it in a child process.

See example below:

myUnix>
myUnix>
myUnix>ps -ef | more
UID    PID PPID C STIME  TTY      TIME CMD
user1 6894    1 0 11:14      ?    00:00:02 gnome-terminal
user1 6904 6894 0 11:14      ?    00:00:00 gnome-pty-helper
user1 6907 6894 0 11:14   pts/0   00:00:00 bash
user1 7443 6907 0 11:24   pts/0   00:00:00 ./simpshell
--More--

Like conventional shells, your new shell interpreter's command lines have the form:
>command argument_1 argument_2 ...
where the command to be executed is the first word on the command line and the remaining words are the arguments expected by that command. Note that the arguments, together with the path of the program, have to be passed to execl or an equivalent function of the exec family, so you will have to do some basic parsing. The number of arguments depends on the command being executed.
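
To make the parsing step concrete, here is a minimal sketch of splitting a command line into the NULL-terminated argument list that the exec family expects. The names parse_line and MAX_ARGS are made up for this illustration, and the sketch only splits on whitespace: quoting, pipes and redirection are not handled.

```c
#include <string.h>

#define MAX_ARGS 64   /* hypothetical limit chosen for this sketch */

/* Split `line` in place on spaces, tabs and newlines.
 * Fills args[0..count-1], stores NULL after the last argument
 * (the exec family requires that terminator) and returns count.
 * max_args must be at least 1. */
int parse_line(char *line, char *args[], int max_args)
{
    int count = 0;
    char *tok = strtok(line, " \t\n");

    while (tok != NULL && count < max_args - 1) {
        args[count++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    args[count] = NULL;
    return count;
}
```

For the line "ps -ef" this gives args[0] = "ps", args[1] = "-ef" and args[2] = NULL, so args[0] is the command and the whole array can be handed to execvp.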

The shell relies on an important convention to accomplish its task: the command is usually the name of a file that contains an executable program. For instance, the commands ls and ps are the names of files (stored in /bin on most UNIX-style machines). In a few cases, the command is not a file name but is implemented within the shell itself, e.g. cd is usually implemented within the shell rather than in a file. Since the vast majority of commands are implemented in files, just think of commands as filenames in some directory on the machine. So the job of the shell is to find the file, prepare the list of parameters for the command, and then cause the command to be executed using those parameters.
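
The reason cd has to live inside the shell is that chdir() only changes the working directory of the process that calls it. If the shell ran cd in a child process, the child would change directory and exit, and the shell itself would stay where it was. Below is a rough sketch of how a shell might check for this built-in before trying to run a file. The function name try_builtin is hypothetical, and falling back to $HOME when no directory is given is an assumption borrowed from common shells.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if args names a built-in that was handled here,
 * 0 if the caller should go on and execute it as a program. */
int try_builtin(char *const args[])
{
    if (args[0] == NULL || strcmp(args[0], "cd") != 0)
        return 0;

    /* Assumption: plain "cd" goes to $HOME, as most shells do. */
    const char *dir = args[1] ? args[1] : getenv("HOME");

    if (dir == NULL)
        fprintf(stderr, "cd: HOME not set\n");
    else if (chdir(dir) != 0)
        perror("cd");   /* e.g. directory does not exist */
    return 1;
}
```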

A shell could use many different strategies to execute the user's computation. However, the basic approach used in modern shells is to create a new process to execute any new computation.

This idea of creating a new process to execute a computation may seem like overkill, but it has a very important characteristic. When the original process hands a new computation to a child, it protects itself from fatal errors that might arise during that execution. If it did not use a child process to execute the command, a fatal error in the command could cause the initial process to fail, taking the whole shell down with it. (I hope you don't create too many child processes and drain all the memory resources...)

Here are a few functions that you will probably need for this task.

fork() function
 
fork is a standard UNIX system call used to create a new process. Whenever a process issues the fork system call, a new process is created. The process that calls fork is the "parent", and the new process created by the system is the "child".

The two are separate processes with different PIDs and separate memory spaces. fork uses copy-on-write semantics: parent and child share memory pages until one of them writes to a page, and only then is that page copied.
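
Here is a small sketch of what "different memory space" means in practice: the child changes a variable, and the parent's copy is untouched. The function name fork_keeps_memory_separate is made up for this illustration, and it uses waitpid (a variant of the wait call described further down) only to collect the child's result.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child's write to `value` stayed invisible
 * to the parent, 0 if not, -1 on error. */
int fork_keeps_memory_separate(void)
{
    int value = 1;

    fflush(stdout);              /* avoid duplicated buffered output */
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {              /* fork returned 0: this is the child */
        value = 42;              /* changes only the child's copy */
        _exit(value);            /* report it through the exit status */
    }

    /* fork returned the child's PID: this is the parent. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    printf("parent %d: child %d exited with %d, my value is still %d\n",
           (int)getpid(), (int)pid, WEXITSTATUS(status), value);
    return value == 1 && WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```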
 
fork is important because it encourages the development of filters. A filter is a small program that reads its input from STDIN and writes its output to STDOUT. A pipeline of such programs can be strung together to create a new command.
 
e.g. $ find . -name "*.cpp" -print | wc -l
Here find and wc each run as a child process of the shell.
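
A shell could wire up a two-command pipeline like this one roughly as follows: create a pipe, fork once per command, and in each child redirect STDOUT or STDIN to the pipe with dup2() before calling exec. This is only a sketch. The name run_pipeline is hypothetical, it handles exactly two commands, and returning 127 for a command that cannot be executed is a convention borrowed from common shells.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `left | right` and return the exit status of `right`,
 * or -1 if the pipeline could not be set up. */
int run_pipeline(char *const left[], char *const right[])
{
    int fd[2];                       /* fd[0] = read end, fd[1] = write end */
    if (pipe(fd) < 0)
        return -1;

    pid_t lpid = fork();
    if (lpid == 0) {                 /* first child: the producer */
        dup2(fd[1], STDOUT_FILENO);  /* its STDOUT now feeds the pipe */
        close(fd[0]);
        close(fd[1]);
        execvp(left[0], left);
        _exit(127);                  /* reached only if exec failed */
    }

    pid_t rpid = (lpid < 0) ? -1 : fork();
    if (rpid == 0) {                 /* second child: the consumer */
        dup2(fd[0], STDIN_FILENO);   /* its STDIN now reads the pipe */
        close(fd[0]);
        close(fd[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* Parent: close both ends, otherwise the consumer never sees EOF. */
    close(fd[0]);
    close(fd[1]);

    int status = 0;
    if (lpid > 0)
        waitpid(lpid, NULL, 0);
    if (rpid < 0 || waitpid(rpid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Closing the unused pipe ends matters in all three processes: as long as any process still holds the write end open, the reader will block instead of seeing end-of-file.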
 
exec() function
 
exec is commonly used together with fork in UNIX. fork is the system call that the parent process uses to split into two nearly identical processes. After fork() is called, the child is created as a copy of the parent, but with a different PID.
 
The fork function returns the child's PID to the parent and returns 0 to the child, so that the two can tell themselves apart. The parent process can either continue or wait for the child process to complete.
 
The child, after discovering that it is the child, replaces itself with another program. When the child calls exec(), its program code and data are discarded and replaced with a running copy of the new program, while the PID stays the same. If the parent chose to wait for the child, the parent receives the exit code of the program that the child executed.

wait() function

It is a system call used in the parent, where the parent waits for the child process to complete. During wait(), the parent is blocked and does nothing until the child terminates.
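
Putting fork, exec and wait together, the core of the shell could look roughly like the sketch below. The name run_command is made up for this illustration. It uses execvp, the exec variant that searches PATH for the program, and waitpid, the variant of wait that waits for one specific child. Returning 127 when the program cannot be executed is a convention borrowed from common shells, not something the assignment prescribes.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run args[0] with arguments args[1..] in a child process.
 * Returns the child's exit code, or -1 if it could not be
 * started or was killed by a signal. */
int run_command(char *const args[])
{
    fflush(stdout);                  /* avoid duplicated buffered output */
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {                  /* child: become the new program */
        execvp(args[0], args);
        perror(args[0]);             /* reached only if exec failed */
        _exit(127);
    }

    int status = 0;                  /* parent: block until child ends */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because a failed exec only ever happens in the child, a mistyped command ends the child with code 127 and the shell simply prints its prompt again.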

Task 2

The goal of task 2 is to implement a pthreads program that performs matrix multiplication. The program should be flexible enough to change the size of the matrices and to set the number of threads. In the first version, the numbers are randomly generated. Once that works, the program takes its input numbers from a file instead of generating them randomly. To make it harder, the program is then modified to read two square matrices from a file (e.g. infile), multiply them, and pass the result to a separate display program (progB) that merely presents the output in a different way.

For example:

myUnix>
myUnix>./MatrixMulti 1 > result.txt
 
The output file, result.txt, should contain the product of the matrix multiplication as shown below:

180 84 168           118 102 135          60720 50448 67308
171 195 42    *       78 110 204      =   43620 44604 69333
 71 164 59           196 136 154          32734 33306 52127

The solution to task 2 is all about threading. The threads of a process share the same memory and can execute simultaneously, which can make the program faster. They are a building block for parallel programming. On UNIX, the thread programming interface is defined by POSIX and is known as POSIX threads, or Pthreads.

Working with threads requires initializing and configuring the thread attributes first. The thread is then created, and it is joined or destroyed at the end.
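
As a rough sketch of the threading idea for the matrix product, each thread can be given a contiguous band of rows of the result. Since no two threads write the same element, no locking is needed. The names Work, multiply_rows and matmul_threads are made up for this illustration, the matrices are stored row by row in flat arrays, and splitting by rows is just one possible way to divide the work.

```c
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    int n;                    /* matrices are n x n, stored row by row */
    int row_begin, row_end;   /* this thread computes rows [begin, end) */
    const long *a, *b;
    long *c;
} Work;

static void *multiply_rows(void *arg)
{
    Work *w = arg;
    for (int i = w->row_begin; i < w->row_end; i++)
        for (int j = 0; j < w->n; j++) {
            long sum = 0;
            for (int k = 0; k < w->n; k++)
                sum += w->a[i * w->n + k] * w->b[k * w->n + j];
            w->c[i * w->n + j] = sum;
        }
    return NULL;
}

/* Compute c = a * b with up to nthreads threads. Returns 0 on success. */
int matmul_threads(int n, const long *a, const long *b, long *c, int nthreads)
{
    if (n < 1)
        return -1;
    if (nthreads < 1) nthreads = 1;
    if (nthreads > n) nthreads = n;   /* at most one thread per row */

    pthread_t *tid = malloc(nthreads * sizeof *tid);
    Work *work = malloc(nthreads * sizeof *work);
    if (tid == NULL || work == NULL) {
        free(tid);
        free(work);
        return -1;
    }

    int started = 0;
    for (int t = 0; t < nthreads; t++) {
        work[t] = (Work){ .n = n,
                          .row_begin = t * n / nthreads,
                          .row_end = (t + 1) * n / nthreads,
                          .a = a, .b = b, .c = c };
        if (pthread_create(&tid[t], NULL, multiply_rows, &work[t]) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++)  /* wait for every worker */
        pthread_join(tid[t], NULL);

    free(tid);
    free(work);
    return started == nthreads ? 0 : -1;
}
```

With the two 3 x 3 matrices from the example above, the first row of the result comes out as 60720 50448 67308, whatever the number of threads. On most systems this needs to be compiled with the -pthread option.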

The diagram shows how the threads work.


JOIN & DETACH threads

"Join" is one way to accomplish synchronization between threads. A thread can be matched by only one pthread_join() call, so attempting to join the same thread more than once is an error. When a thread is created, one of its attributes defines whether it is joinable or detached. Only threads that are created as joinable can be joined. If a thread is created as a detached thread, it can never be joined.

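Here is a minimal sketch of creating a thread that is explicitly joinable and collecting its result with pthread_join(). The names square and square_in_thread are made up for this illustration. Threads are joinable by default, so setting the attribute is only there to show where the choice is made; PTHREAD_CREATE_DETACHED would be the other option, and such a thread must not be joined.

```c
#include <pthread.h>

/* Thread function: squares the long that arg points to
 * and hands the same pointer back as its return value. */
static void *square(void *arg)
{
    long *v = arg;
    *v = *v * *v;
    return v;
}

/* Returns x * x computed in a separate thread, or -1 on error. */
long square_in_thread(long x)
{
    pthread_attr_t attr;
    pthread_t tid;
    long value = x;
    void *result = NULL;

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    int rc = pthread_create(&tid, &attr, square, &value);
    pthread_attr_destroy(&attr);     /* no longer needed after create */
    if (rc != 0)
        return -1;

    /* Blocks until the thread finishes, then fetches its return value. */
    if (pthread_join(tid, &result) != 0)
        return -1;
    return *(long *)result;
}
```

The join is what makes it safe to pass a pointer to the local variable value: the function cannot return, and the variable cannot disappear, before the thread is done with it.
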
MUTEX

Mutex is an abbreviation for "mutual exclusion". It is one of the simplest mechanisms we can use to synchronize threads and to protect shared data when several threads write to it.

A mutex variable acts like a "lock" protecting access to a shared data resource. The basic concept of a mutex is that only one thread can lock (or own) a mutex variable at any given time. Threads must "take turns" to access protected data.
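
The "take turns" idea can be sketched with a shared counter that several threads increment. Without the lock, increments from different threads could overwrite each other and the total would usually come out too low. With the lock, every increment is counted. The names add_many and count_with_mutex, and the cap of 16 threads, are assumptions made for this illustration.

```c
#include <pthread.h>

#define MAX_THREADS 16   /* hypothetical cap chosen for this sketch */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;     /* shared data protected by `lock` */

static void *add_many(void *arg)
{
    int iters = *(int *)arg;
    for (int i = 0; i < iters; i++) {
        pthread_mutex_lock(&lock);     /* only one thread gets past here */
        counter++;
        pthread_mutex_unlock(&lock);   /* let the next thread in */
    }
    return NULL;
}

/* Start nthreads threads that each add 1 to the counter iters times,
 * wait for all of them, and return the final counter value. */
long count_with_mutex(int nthreads, int iters)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    counter = 0;

    for (int t = 0; t < nthreads; t++) {
        if (pthread_create(&tid[t], NULL, add_many, &iters) != 0)
            break;
        started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
    return counter;
}
```

Note that the matrix multiplication itself does not need a mutex if every thread writes to its own part of the result. A mutex becomes necessary as soon as threads update the same variable, such as a running total or a "next row to process" index.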

