Revision as of 11:13, 8 December 2009

Development Tips

Oprofile

It is often mentioned that running oprofile is more complicated than using gprof, because a daemon has to be started and a kernel module loaded. But gprof requires recompiling the application and its dependent libraries with the -pg option, which can be worse when you also need to recompile the glib library. Setting and using oprofile:

  • configuring: http://www.ua.kernel.org/pub/mirrors/centos.org/4.6/docs/html/rhel-sag-en-4/s1-oprofile-configuring.html

Best practices

Every good textbook covers problems with memory allocation, the performance of specific functions, and so on. The best thing to do is buy a good book ;-) It doesn't make sense to think about every line; optimize only the things that are performance bottlenecks.

Here is a short overview of areas that are often problematic:

  • Threads
  • Wake up only when necessary
  • Don't use [f]sync() if not necessary
  • Do not actively poll in programs or use short regular timeouts, rather react to events
  • If you wake up, do everything at once (race to idle) and as fast as possible
  • Use large buffers to avoid frequent disk access. Write one large block at a time
  • Group timers across applications if possible (even systems)
  • Avoid excessive I/O, power consumption, and memory usage (memory leaks)
  • Avoid unnecessary work/computation

And now some examples:

Threads

It is widely believed that using threads makes an application perform better and faster. But that is not true in every case.

Python uses a Global Interpreter Lock (GIL), so threading pays off only for larger I/O operations. We can help ourselves by optimizing with Unladen Swallow (still not merged upstream).

Perl threads were originally created for applications running on systems without fork (win32). In Perl, data is copied for every thread (copy on write). The data is not shared by default, because the user should be able to define the level of data sharing. To share data, the threads::shared module must be included; the data is then copied (copy on write) and the module additionally creates tied variables for it, which takes even more time, making it even slower.

Reference: performance of threads

In C, threads share the same memory; each thread has its own stack, and the kernel doesn't have to create new file descriptors or allocate new memory space. C can really use the support of multiple CPUs for multiple threads.

Therefore, if you want better performance from your threads, you should use a lower-level language like C/C++. If you use a scripting language, it is possible to write the critical part as a C binding. The low-performing parts can be tracked down with profilers.

Reference: improving performance of your application

Wake-up

Many applications scan configuration files for changes. In many cases this is done at a set interval, e.g. every minute. This can be a problem, because it forces the disk to wake up from spindown. The best solution is to find a good interval and a good algorithm, or to check for changes with inotify and react to events. Inotify can detect a variety of changes on a file or a directory. The problem is that the number of watches on a system is limited. The limit can be obtained from

/proc/sys/fs/inotify/max_user_watches

and it can be changed, but that is not recommended.

Example:

int fd;
fd = inotify_init();
int wd;
/* watch for modifications (writes) to the file */
wd = inotify_add_watch(fd, "./myConfig", IN_MODIFY);
if (wd < 0) {
  inotify_cant_be_used();
  switching_back_to_previous_checking();
}
...
fd_set rdfs;
struct timeval tv;
int retval;
FD_ZERO(&rdfs);
FD_SET(fd, &rdfs);

tv.tv_sec = 5;
tv.tv_usec = 0;
retval = select(fd + 1, &rdfs, NULL, NULL, &tv);
if (retval == -1)
  perror("select");
else if (retval > 0) {
  do_some_stuff();
}
...

Pros:

  • can watch a variety of changes

Cons:

  • finite number of watches on a system
  • inotify may fail or be unavailable

In case inotify fails, the program must switch back to a different way of checking. That is usually done with a lot of "#ifdef"s in the code.

Reference:

man 7 inotify

Fsync

Fsync is known as an I/O-expensive operation, but according to the reference below that is not completely true. The article also has an interesting discussion showing many different opinions on (not) using fsync at all.

The typical examples are the Firefox freeze (with fsync) vs. empty files (without fsync). What happened in these cases?

Firefox used to call the sqlite library each time the user clicked a link to go to a new page. sqlite called fsync, and because of the file system settings (mainly ext3 in data-ordered mode) there was a long latency during which nothing happened. This could take a long time (even 30 s) if another process was copying a large file at the same time.

In other cases fsync wasn't used at all, and there was no problem until the switch to the ext4 file system. ext3 in data-ordered mode flushed memory to disk every few seconds, but with ext4 and laptop_mode the interval was longer, so data could be lost when the system was switched off unexpectedly. ext4 is patched now, but we should still think about the design of our applications and use fsync carefully.

Let's show with a simple example of reading and writing a configuration file how a backup of the file can be made, and how data can be lost.

Bad example:

/* read the old configuration, e.g. from ~/.kde/myconfig */
fd = open("myconfig", O_RDONLY);
read(fd, buffer, sizeof(buffer));
close(fd);
...
/* truncate and rewrite the file in place - a crash here loses the data */
fd = open("myconfig", O_WRONLY|O_TRUNC|O_CREAT, 0666);
write(fd, bufferOfNewData, sizeof(bufferOfNewData));
close(fd);

Better example:

/* read the old configuration */
fd = open("myconfig", O_RDONLY);
read(fd, buffer, sizeof(buffer));
close(fd);
...
/* write the new data into a temporary file first */
fd = open("myconfig.new", O_WRONLY|O_TRUNC|O_CREAT, 0666);
write(fd, bufferOfNewData, sizeof(bufferOfNewData));
fsync(fd); /* paranoia - optional */
close(fd);
rename("myconfig", "myconfig~"); /* keep a backup - optional */
rename("myconfig.new", "myconfig"); /* atomic replace */

Reference: inside of fsync