From Fedora Project Wiki
Line 57: Line 57:




== Solution: ==
= Solution: =
1/ Because the test input number was high there were only slight differences between bash and c implementation.
1/ Because the test input number was high there were only slight differences between bash and c implementation.
The winning solution use formula for the computing of the sum ([[File:02first.c]]).
The winning solution use formula for the computing of the sum ([[File:02first.c]]).

Revision as of 13:09, 29 September 2009

Watt hunting competition

Winners:

1. Dan Mach
2. Pavel Moravec
3. Tomas Janousek

(from 24 competitors)

Problem description:

We have three tasks - which have to be implement in 45 minutes.

Then we measure only the time of the run of program on test data set (we want to measure disk accesses too, but scomes tool was not stable enough and that there were only small differences in disk accesses in the solution we use for tests in comparison with the time differences). The input data were quite large to emphasize the differences between algorithms and implementation changes.

Then we compare the power consumption of all implementations (there were a penalisation for not implementing a task 20ms).

The differences between the disc accesses were not to high in our case so we omit to measure them.


Tasks are:

1/ Implement a program which input is a positive number less then 10 000 (we will denote the number n) and the output is 1+2+3+...+(n-1) The return code have to be 0.

Example:

 	$ ./first 5
	10

2/ Implement a program, which input is a file name and this program put the file on output - but omit all white spaces (' ', '\t', '\n'). The return code have to be 0.

Example:

 	$ cat jmeno.txt
	. . . .
	..
	Ahoj svete
	... ...
	$ ./second jmeno.txt
	......Ahojsvete......

3/ Implement a program which input is file name and which return the number of occurrences of string "file" in input file. The string have to be separated with white space (' ', '\t', '\n') on the beginning and at the end. The return code have to be 0.

Example:

 	$ cat text.txt
	123 file files hula
	file pes
	pfile les
	$ ./third text.txt
	2


Solution:

1/ Because the test input number was high there were only slight differences between bash and c implementation. The winning solution use formula for the computing of the sum (File:02first.c).

2/ In c there is several methods to implement this tasks the slower was fscanf (File:02second.c), faster variants use getc (File:01second.c) and the fastest one uses fread (File:03second.c). Bash version which uses sed parser is rather slow, but the one using tr command is really fast.

3/ In c problem could use several methods either go through whole file and the fastest was using fscanf (File:03third.c), better version use some mechanism to find the occurrence of f character at first and then test whether it is part of file word (or some similar mechanism) (File:01third.c). The best solution uses mmap and order the tests based on the percent occurrence of character in text (File:02third.c). In bash the winning strategy parse the output using grep command, and then use tr to find the proper number sed could be used too but it was slower again. Here could be used Boyer-Moore algorithm - but it is problem to have it in time (nobody use it), and the string is too short so the results are not so good.

In all cases there is fine to use come optimisation flags. There are another methods how to speed up the solution (use fgetc_unlock and so on), nobody use it this competition (but feel free to try to test it :) ).