Watt hunting competition

We have three tasks - which have to be implement in 45 minutes.

Then we measure only the time of the run of program on test data set (we want to measure disk accesses too, but scomes tool was not stable enough and that there were only small differences in disk accesses in the solution we use for tests in comparison with the time differences). The input data were quite large to emphasize the differences between algorithms and implementation changes.

Then we compare the power consumption of all implementations (there were a penalisation for not implementing a task 20ms).

The differences between the disc accesses were not to high in our case so we omit to measure them.

Tasks are:

1/ Implement a program which input is a positive number less then 10 000 (we will denote the number n) and the output is 1+2+3+...+(n-1) The return code have to be 0.

Example:

 	$ ./first 5
	10

2/ Implement a program, which input is a file name and this program put the file on output - but omit all white spaces (' ', '\t', '\n'). The return code have to be 0.

Example:

 	$ cat jmeno.txt
	. . . .
	..
	Ahoj svete
	... ...
	$ ./second jmeno.txt
	......Ahojsvete......

3/ Implement a program which input is file name and which return the number of occurrences of string "file" in input file. The string have to be separated with white space (' ', '\t', '\n') on the beginning and at the end. The return code have to be 0.

Example:

 	$ cat text.txt
	123 file files hula
	file pes
	pfile les
	$ ./third text.txt
	2

Solution:

1/ Because the test input number was high there were only slight differences between bash and c implementation. The winning solution use formula for the computing of the sum (File:02first.c).

2/ In c there is several methods to implement this tasks the slower was fscanf (File:02second.c), faster variants use getc (File:01second.c) and the fastest one uses fread (File:03second.c). Bash version which uses sed parser is rather slow, but the one using tr command is really fast.

3/ In c problem could use several methods either go through whole file and the fastest was using fscanf (File:03third.c), better version use some mechanism to find the occurrence of f character at first and then test whether it is part of file word (or some similar mechanism) (File:01third.c). The best solution uses mmap and order the tests based on the percent occurrence of character in text (File:02third.c). In bash the winning strategy parse the output using grep command, and then use tr to find the proper number sed could be used too but it was slower again.

Search

DeveloperConference2009/WHcompetition

Watt hunting competition

Solution: