Computer Aided Assessment of PROGRAMS

Alessandro Cucchiarelli, Danilo Luzi, Maurizio Panti, Salvatore Valenti

Computer Science Dept. - University of Ancona - Italy

email: {Alex, Panti, Valenti}@inform.unian.it

Keywords: Computer Aided Assessment, Assessment of Programs, Static Software Quality

1. Introduction

The majority of large universities offer a portion of their required freshman classes in lectures of a hundred or more students [1, 2, 6]. Bjedov [1] reports teaching "Introduction to C Programming" in a class whose enrolment varies between 360 students in spring semesters and 940 students in fall semesters. At the University of Ulster, class sizes in excess of 100 are common for popular entry-level modules in computer programming and hardware topics [6]. Similar numbers are reported in Italy at the Universities of Ca' Foscari (Venice) and Ancona [2, 8] and in Brazil at the University of Pernambuco [3], leading to the conclusion that the rising size and diversity of teaching groups in higher education is a world-wide problem.

Dealing with large classes raises a number of problems from both the lecturer's and the students' point of view.

Teaching large classes is often seen as a difficult and unwelcome assignment. Moreover, the professor is able to know only a limited number of students, and since lectures take place in a short interval of time, only a few of the students with questions about the material can be helped. The same considerations hold for the support that may be given in office hours. Furthermore, exams cannot be given to all the students at the same time due to lack of resources: this often results in exams being graded by different people, with differences in grading styles that may become relevant regardless of any "grading blending" policy.

Freshmen, for their part, must cope with a new evaluation approach that is quite different from the one used in high schools: frequently the only opportunity to verify the results of their study consists of a single final examination at the end of the lecture period, with no checkpoint during the study period. Such a situation is perceived as a "big problem" in the first or second year of the curriculum, since it often results either in delaying the student's career or in poor grades.

Self-testing at any time and from any site is feasible using the Internet: this has led to the development of a number of commercial applications and academic projects in the fields of Computer Assisted Assessment [9] and Distributed Learning [11].

The following sections of this paper present a self-assessment system based on W3 technology for the course "Fondamenti di Informatica" (Foundations of Computer Science), developed at the Computer Science Dept. of the University of Ancona. In more detail, sections 2 and 3 discuss the system architecture as it results from its implementation in a stand-alone release, while section 4 presents the interface to the new release (currently under development), which will be accessible via the Internet.

2. The Program Assessment Module

To allow students to self-evaluate their Pascal/C programming capability, we have set up a verification procedure based on three steps:

- definition of a problem the student is required to solve through a Pascal/C program;

- language selection and program coding by the student;

- automatic evaluation of the program and result report.

The first step of the procedure is carried out by the assessment module, which automatically selects one of the problems stored in its internal database, which is populated by the tutor. Each problem is defined by a text containing a description of the problem itself, along with the data structures requested and a sketch of the functional components of the algorithm. The selection procedure used by the module is based on a classification of problems by topic and, within each topic, on a sub-classification by increasing level of difficulty. Each student, after the login phase, selects the topic of the test from a predefined list, leaving the task of problem identification to the assessment module. The module selects the test that follows the last one solved in the difficulty hierarchy, according to the history of the student's previous logins. The text of the problem is then shown on screen and the student can move on to the second step of the procedure.
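The selection rule described above can be illustrated with a minimal sketch. The Problem record, the integer difficulty levels and the select_next_problem function are hypothetical names introduced here for illustration; the actual database layout of the module is not described in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    """Hypothetical problem record: topic, difficulty level and text."""
    topic: str
    difficulty: int  # assumed convention: 1 is the easiest level
    text: str


def select_next_problem(problems, topic, solved_levels):
    """Return the problem of `topic` that follows the last level solved.

    `solved_levels` holds the difficulty levels the student has already
    solved for this topic (taken from the login history). Returns None
    when no harder problem is left for the topic.
    """
    last_solved = max(solved_levels, default=0)
    candidates = sorted(
        (p for p in problems if p.topic == topic and p.difficulty > last_solved),
        key=lambda p: p.difficulty,
    )
    return candidates[0] if candidates else None
```

A student with no history for a topic would receive the easiest problem of that topic, and one who has solved every level would receive none.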

Through a dedicated interface, the student can enter the programming environment and code the solution in the selected language. This is a standard full-function environment, with a set of tools (editor, compiler, linker and runner) that the student can use to write and test her/his program. No explicit limit on compilation or run attempts is defined in this phase: there is only a limited amount of time (one hour) to produce the solution. When the time expires, or when the student signals that she/he has completed the task, the last version of the source program is taken as the one to evaluate.

The next step performed by the module is the automatic evaluation of the student's source code, which is activated only if the provided code is syntactically correct. It is based on a functional test, devoted to verifying the functional correctness of the program, followed by a static software quality evaluation, founded on the application of four quality factors: correctness, readability, simplicity and flexibility.

In more detail, both phases are based on a comparative analysis of the student program and a reference solution program provided by the tutor and stored in the internal problem database.

The functional test makes a series of runs (the number is chosen by the tutor) of the program being tested, providing predefined input data for each of them and comparing the results obtained with the correct ones. The overall evaluation is expressed as the percentage of correct runs (i.e. those giving the right results). The student report produced at the end of this phase includes not only the overall evaluation score, but also an indication of the nature of the errors related to each single run, if any, as provided by the program running environment.
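The scoring of the functional test can be sketched as follows. This is only an illustration under simplifying assumptions: the student program is modelled as a Python callable rather than a compiled Pascal/C executable, a raised exception stands in for a run-time error reported by the running environment, and the names functional_test and report are hypothetical.

```python
def functional_test(program, test_cases):
    """Run `program` once per test case and score the percentage of correct runs.

    `program` is a callable standing in for the compiled student program;
    `test_cases` is a list of (input_data, expected_output) pairs chosen
    by the tutor. Returns (score, report), where report has one entry per run.
    """
    report = []
    correct = 0
    for input_data, expected in test_cases:
        try:
            actual = program(input_data)
            error = None
        except Exception as exc:  # stands in for a run-time error
            actual = None
            error = f"{type(exc).__name__}: {exc}"
        passed = error is None and actual == expected
        correct += passed
        report.append({"input": input_data, "passed": passed, "error": error})
    score = 100.0 * correct / len(test_cases) if test_cases else 0.0
    return score, report
```

Keeping the per-run entries alongside the overall percentage mirrors the student report described above, which lists the nature of the error for each single run.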

In order to evaluate the quality of the student program, the source code is analysed by a software function that computes the value of the four quality factors, starting from the same parameters estimated for the reference solution. The closer the student's program structure is to the reference, the higher the score. Details on the nature of the parameters used can be found in the next section. After the completion of the evaluation, the report sent to the student includes, along with the global score, the value assigned to each parameter and the source code of the reference solution [13, 14].

3. The Static Software Quality Evaluator: Va.S.Qu.S.

Effective software measurement and meaningful data interpretation depend on the recognition of the essential duality of the entire measurement process.

Measurement therefore involves the definition of two models: the empirical real-world context in which the measurement is to take place, and a numerical model incorporating well-defined, measurement-based aspects of the empirical model. Measurement theory is founded on the definition of formal mappings between the two models and the selection of the appropriate measurement scale [14].

Va.S.Qu.S. analyses and evaluates the source code according to a set of predefined static properties, scoring each of them as a percentage. Each score is then mapped to a quality factor, also expressed as a percentage, which represents an index of how well the source code under analysis satisfies the property considered.

The association of a generic property score Pi with the related quality factor Qi is obtained by applying the expression Qi = fi(Pi), where fi is one of the three different functions shown as curves in table 1.

Table 1 - Quality factors Vs scores: the three different curves

As can be seen from table 1, the three curves are characterised by the parameters PL0, PSi, PSs and PH0, which divide the variability range of the scores into subintervals in which the curves may show increasing, decreasing, saturating, desaturating, or constant behaviour. Furthermore, the parameter Qx, representing the value given by the curve in the stationary interval P ∈ (PSi, PSs), is to be taken into account.
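Since the curves themselves are not reproduced here, the following is only a guess at what one of them might look like: a trapezoidal mapping that is zero outside (PL0, PH0), rises linearly between PL0 and PSi, stays at Qx on the stationary interval, and falls linearly between PSs and PH0. The linear edges and the function name trapezoid_quality are assumptions made for illustration, not the definition used by Va.S.Qu.S.

```python
def trapezoid_quality(p, pl0, psi, pss, ph0, qx=100.0):
    """Map a property score p to a quality factor (hypothetical curve shape).

    Assumes pl0 <= psi <= pss <= ph0. The quality factor is 0 outside
    (pl0, ph0), equals qx on the stationary interval [psi, pss], and is
    interpolated linearly on the rising and falling edges.
    """
    if p <= pl0 or p >= ph0:
        return 0.0
    if p < psi:
        # rising edge between pl0 and psi
        return qx * (p - pl0) / (psi - pl0)
    if p <= pss:
        # stationary interval
        return qx
    # falling edge between pss and ph0
    return qx * (ph0 - p) / (ph0 - pss)
```

For example, with PL0 = 0, PSi = 40, PSs = 60 and PH0 = 100, a score of 20 lies halfway up the rising edge and would be mapped to half of Qx.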

Note that the curves depicted in table 2 have been obtained by modifying the values of the parameters discussed above until they reach limit conditions.

Table 2 - Quality factors Vs scores: some limit curves

Va.S.Qu.S. introduces three more parameters to enhance the flexibility of its application to different evaluation contexts: Tag1, Tag2 and MinValue.

The constant values Tag1 and Tag2 define the range of applicability of each property scoring function, which contributes to the quality measure only for values within the range and is not taken into account for other values. Tag1 and Tag2 are optional parameters in the scoring function specification: neither may be defined, so that any possible value of the function is accepted in the quality measure, or only one of them may be used, so that only scores lower than Tag1 or higher than Tag2 are taken into account.

Table 3 summarizes the relation between the values of the score function Pi and those accepted for the quality measure P'i when both Tag1 and Tag2, or only Tag1 or only Tag2, are specified.

The last parameter, MinValue, defines a minimal threshold for quality factor acceptability of each property measured in the source code: any Qi score under MinValuei is regarded as unsatisfactory.

Table 3 - Pi Vs. P'i mapping
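The role of the three parameters can be sketched as below. The sketch follows the wording of the text, where Tag1 alone admits only scores lower than Tag1 and Tag2 alone admits only scores higher than Tag2; treating the two-tag case as Tag2 < P < Tag1 is an assumption, as are the strict inequalities and the function names accepted_score and is_satisfactory.

```python
def accepted_score(p, tag1=None, tag2=None):
    """Return p if it contributes to the quality measure, otherwise None.

    tag1 (optional): only scores lower than tag1 are accepted.
    tag2 (optional): only scores higher than tag2 are accepted.
    With neither tag defined, every score is accepted.
    """
    if tag1 is not None and not p < tag1:
        return None
    if tag2 is not None and not p > tag2:
        return None
    return p


def is_satisfactory(q, min_value):
    """A quality factor under min_value is regarded as unsatisfactory."""
    return q >= min_value
```

Returning None for a rejected score keeps "not taken into account" distinct from a score of zero, which would otherwise lower the quality measure.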

Va.S.Qu.S. incorporates an analyser that measures static attributes of source code belonging to the following classes: correctness, readability, simplicity, flexibility.

Correctness is the stylistic property of the code conforming to the general rules defined for the correct use of the programming language characteristics.

Readability expresses how easily the nature of the algorithm can be understood simply by reading the related code.

Simplicity measures both the ease of understanding the mechanisms which regulate the execution of the program and the ease of modifying them.

Finally, flexibility is the ability to modify the code in a simple and localised way.

Each of the previous attributes may be evaluated by defining a set of parameters strictly related to the source code, such as those listed in Table 4.

Correctness: use of user-defined types; interface minimisation; visibility restrictions; function exit-points; formal parameters "by copy"; formal parameters "by reference"

Readability: volume of comments; indentation; lines of code length

Simplicity: nesting level

Flexibility: use of constants; use of library routines; overloading of identifiers

Table 4 - Static Software Quality Metrics
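To give a feeling for how such static parameters might be extracted from C source text, the sketch below computes rough versions of three of them: nesting level, volume of comments and line length. These are simplified stand-ins written for illustration, not the measures defined in [14]: braces inside strings or comments are not treated specially, and only /* ... */ comments are counted. All function names are hypothetical.

```python
import re


def max_nesting_level(source):
    """Deepest level of brace nesting found in C-like source text."""
    depth = deepest = 0
    for ch in source:
        if ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth = max(depth - 1, 0)
    return deepest


def comment_volume(source):
    """Percentage of characters that belong to /* ... */ comments."""
    if not source:
        return 0.0
    comments = re.findall(r"/\*.*?\*/", source, flags=re.DOTALL)
    return 100.0 * sum(len(c) for c in comments) / len(source)


def longest_line(source):
    """Length in characters of the longest source line."""
    return max((len(line) for line in source.splitlines()), default=0)
```

Raw values of this kind would correspond to the property scores Pi, which are then mapped to quality factors as described earlier in this section.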

The interested reader should consult [14] for detailed information about each parameter listed in table 4.

4. The Client Interface

We are working on the definition and implementation of this interface as a WWW application, and this section illustrates the aspects related to this ongoing implementation. The client allows the student to access the system after an identification procedure whose interface is depicted in fig. 1.

The student is asked to enter the Matriculation Number along with Name and Last Name. While this information is enough for self-assessment, which has no official value, the problem of personal identification remains open when using the software for University Examinations, since this task appears difficult to perform automatically in a cost-effective way with the available technology [2].

The information about Matriculation Year, Degree and Course is used both to allow different courses to use the same client for self-assessment and for statistical purposes, in order to understand, for instance, how the students plan to attend the course during their curricula [8].

Once the student has been identified, the client application presents the interface to the programming environment depicted in fig. 2.

The interface to the programming environment allows the student to develop a program, code it in the selected language, and run it using sample data, which must be provided by the student. When the student is satisfied with the functional correctness of the program, he/she may activate Va.S.Qu.S. through the button "S.S. Quality" in order to verify the static quality of the software developed.

Figure 1 - Interface for User Identification

The resulting screen is depicted in fig. 3.

The program developed is submitted to the tutor either manually by the student, through the button "Submit" of the screen depicted in fig. 2, when he/she is satisfied with the results obtained, or automatically by the client application on expiration of the time slot provided.

The report generated by the automatic evaluator is provided shortly afterwards, through the screen depicted in fig. 4. Pressing the button "More" in the Run section of the screen shows details of the run-time errors generated during the functional analysis.

The button "More" of the Quality section expands the static software quality analysis for each parameter of each class.

Figure 2 - Interface to the programming environment

We are currently working on the implementation of the client interface and on porting the stand-alone Program Assessment Module to the Linux O.S.

References

[1] G. Bjedov, "Utilizing the World Wide Web and the Internet to Facilitate Learning in Large Classes", in ASEE/IEEE Frontiers in Education Conference 1995, D. Budny and R. Herrick eds., IEEE CS Press, 1995.

[2] A. Celentano, F. Dalla Libera, "Remote Assessment in InterNet Based Environments", in Proceedings of European Conference on Networking Entities - NETIES '97, G. Cancellieri ed., 1997.

[3] P.C. de Azevedo Restelli Tedesco, D. da Fonseca de Sousa, "Building a Java based Intelligent Tutoring System for Introductory Computing", Proceedings of AISC'97, M.H. Hamza ed., 347-350, Acta Press, 1997.

[4] D. Laming, "The reliability of a certain university examination compared with the precision of absolute judgements", Quarterly Journal of Experimental Psychology, Vol. 42, 2, 239-254, 1990.

Figure 3 - Quality Measures

[5] E.M. McCabe, I. Troise, "An integrated approach to Computer Aided Assessment", Proceedings of 4th Annual Conference on the Teaching of Computing, www.ctc.dcu.ie/ctcweb/confer/1996/24.html, 1996.

Figure 4 - Report of Results

[6] A.A. Moore, "Remote Class Test Management with the World Wide Web", in Applied Informatics '96, M.H. Hamza ed., Acta Press, 229-231, 1996.

[7] P.A. Nye, T.J. Crooks, M. Powley, G. Tripp, "Student Note-Taking Related to University Examination Performance", Higher Education, Vol. 13, 1, 85-97, 1984.

[8] M. Panti, A. Cucchiarelli, S. Valenti, "A Web-based Approach To Automated Test Assessment", in Proceedings of European Conference on Networking Entities - NETIES '97, G. Cancellieri ed., 1997.

[9] A. Peel, "Computer Aided Assessment through hypermedia", Active Learning, 1, CTISS Publications, 1994.

[10] M.J. Reeds, "Automatic Assessment Aids for Pascal Programs", SIGPLAN Notices, 17, 10, 33-42, 1982.

[11] S. Saltzberg, S. Polyson, "Distributed Learning on the World Wide Web", Proceedings of IUC Computer Mediated Conference, www.umuc.edu/iuc/cmc96/papers/poly-p.html, 1996.

[12] D.L. Schroeder, M.J. Granger, "A New Generation of Network Technology Requires New Generation Teaching", in "Managing Information and Communications in a Changing Global Environment", M. Khosrowpour ed., Idea Group, 1995.

[13] T. Siliquini, "Determinazione e Sperimentazione di Metriche per la Qualità di Programmi scritti in Linguaggio C", Degree Thesis, University of Ancona, 1998.

[14] S. Valenti, A. Cucchiarelli, M. Panti, "Automatic Assessment of Static Software Quality", in "Effective Utilization and Management of Emerging Information Technologies", M. Khosrowpour ed., Idea Group, 1998.

[15] M. Zeidner, "Key Facets of Classroom Grading: A Comparison of Teacher and Student Perspectives", Contemporary Educational Psychology, Vol. 17, 3, 224-243, 1992.