Things have been going quite well with the project, with a few exceptions (some bugs). I managed to implement the execution restore. To do this, I started by logging the finished processes to a file. Restoring the execution consists of continuing from where the system crashed. While working on this, I discovered a bug caused by old process objects that were not properly removed from memory. In some rare cases this caused a segmentation fault, because signals were emitted after the process object had been deleted. The fix was simple: use deleteLater() to remove the instances, but it took me a long time to figure this out.
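To make the restore idea concrete, here is a minimal sketch of such a log in plain C++. It is not the actual WHC::IDE code: the function names, the one-id-per-line file format, and ids like "taskA#0" are all assumptions made for illustration.

```cpp
#include <fstream>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: append the id of a finished process to the log file.
// Appending one line per process means a crash loses at most the entry
// that was being written.
void logFinished(const std::string& logPath, const std::string& processId) {
    std::ofstream out(logPath, std::ios::app);
    out << processId << '\n';
}

// Hypothetical helper: read back the ids of all processes that finished
// before the crash. A missing file simply yields an empty set.
std::set<std::string> loadFinished(const std::string& logPath) {
    std::set<std::string> done;
    std::ifstream in(logPath);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            done.insert(line);
        }
    }
    return done;
}

// Given the full, ordered execution plan, keep only the processes that
// are not in the log, preserving their original order.
std::vector<std::string> remaining(const std::vector<std::string>& plan,
                                   const std::set<std::string>& done) {
    std::vector<std::string> todo;
    for (const std::string& id : plan) {
        if (done.count(id) == 0) {
            todo.push_back(id);
        }
    }
    return todo;
}
```

On restore, the idea would be to call loadFinished() once and run only what remaining() returns, so that work completed before the crash is not repeated.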
Moving on, I started working on the logging system. After a talk with Veaceslav Munteanu (my mentor) and Grigore Lupescu (Veaceslav’s mentor, from the previous RSoC), we decided that we should use an existing logging system. Grigore suggested Nagios. The first step was preparing the IDE for a logging system. WHC::IDE had no way of telling how a process ended or, if it crashed, what the cause of the crash was.
I added this functionality, and while I was at it, I saw a way to improve the execution restore. This is closely related to how tasks run in WHC::IDE, so I am going to briefly explain it.
When you click on the “Run” button, the IDE performs a topological sort that establishes the order in which to run the tasks. A task may have several processes associated with it: for every combination of input files, there is a corresponding process. The output from a process could be used by several tasks, but it is a waste of resources to run the process again for each task or folder that needs the output. Instead, the IDE runs the process only once and puts the output in a temporary folder. After that, it copies that folder or hands it to the other tasks that require it. There are a lot of IO operations involved, so this can go wrong in a lot of ways. The improvement I saw was the following: when the process itself finishes but the IO operations fail, the IDE will still add the process to the list of processes that have run, only this time it will mark it as an “IOError process”. When the user wants to restore the execution, WHC::IDE will go back to the temporary folder and retry the IO operations, but it will not run the process again.
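The two ideas above can be sketched in a few lines of standard C++. This is only an illustration under my own assumptions, not the code in WHC::IDE: the Graph type, the enum names, and the use of Kahn's algorithm for the topological sort are all hypothetical choices.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical representation: deps[t] lists the tasks whose output t consumes.
typedef std::map<std::string, std::vector<std::string> > Graph;

// Topological sort using Kahn's algorithm: a task becomes ready once all
// of its inputs have been produced. Returns an empty vector on a cycle.
std::vector<std::string> runOrder(const Graph& deps) {
    std::map<std::string, int> pending;  // unmet inputs per task
    Graph users;                         // reverse edges: who consumes a task
    for (Graph::const_iterator it = deps.begin(); it != deps.end(); ++it) {
        pending[it->first];              // make sure the task is registered
        for (std::size_t i = 0; i < it->second.size(); ++i) {
            pending[it->second[i]];      // register the input as a task too
            ++pending[it->first];
            users[it->second[i]].push_back(it->first);
        }
    }
    std::queue<std::string> ready;
    for (std::map<std::string, int>::const_iterator it = pending.begin();
         it != pending.end(); ++it) {
        if (it->second == 0) ready.push(it->first);
    }
    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::string task = ready.front();
        ready.pop();
        order.push_back(task);
        const std::vector<std::string>& next = users[task];
        for (std::size_t i = 0; i < next.size(); ++i) {
            if (--pending[next[i]] == 0) ready.push(next[i]);
        }
    }
    if (order.size() != pending.size()) order.clear();  // cycle detected
    return order;
}

// Hypothetical outcome recorded for each process in the restore log.
enum Outcome { NotRun, Finished, IoError };

// What a restore would do with a process, given its recorded outcome.
enum RestoreAction { RunProcess, RetryIo, Skip };

RestoreAction onRestore(Outcome outcome) {
    switch (outcome) {
        case Finished: return Skip;     // output already delivered
        case IoError:  return RetryIo;  // output is in the temporary folder
        default:       return RunProcess;
    }
}
```

The point of the separate IoError outcome is that a restore only repeats the cheap step (copying from the temporary folder) and never the expensive one (running the process).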
After completing this and working with the execution class, I saw yet another way I could improve the project: the execution speed on machines with multiple devices. WHC::IDE can already run a single task in parallel, with each device running a process with different inputs. But what happens when you have many tasks, each with only one input, that could run in parallel? Well, they could all run on different devices with a small adjustment to the running algorithm. The problem with this “small adjustment” is that it requires A LOT of code refactoring. I started trying different approaches, but they are not ready to be committed to the project because they break some things.
Going back to the logging system (sorry for getting so distracted), I currently have some problems getting Nagios to run on my machine. It also doesn’t run on Windows, and one of the project’s goals is to be cross-platform. I am starting to believe that the way to go is to write something new and lightweight for our project.
Next time I will tell you more about the logging system (I will get it working by then) and also about the editor.