WHC::IDE #4 – Editor

Hello readers! This time I’ve been working on improving the editor. My goal is to add some basic code editing features and fix the broken ones.

I am trying to integrate Kate, the KDE editor, into WHC::IDE, but there are some problems that (I think) are caused by my system having both Qt4 and Qt5 installed. The two appear to conflict: for some reason the compiler picks Qt5, even though the CMake files specify that Qt4 is to be used.

While struggling with Kate, I spent some time improving the current editor. This way I have two options in case one of them fails. I’ve added bracket matching, fixed the highlighting and made the options relevant. One of the biggest problems was that the options system would not load when opening the editor, which made it useless. I am happy with the results, and very soon we will also have auto-indent.

Apart from the editor, I also fixed a bug caused by connecting two data diagrams. Data diagrams contain, as their name suggests, only data files that are waiting to be processed by a task or are the output of a task. The IDE didn’t know what to do when two data diagrams were connected, and this caused problems with the execution.

WHC::IDE #3 – Logging and execution improvement

Sorry for taking such a long break from the blog. In the last three weeks I’ve been working on the logging system and, also, I’ve taken a small vacation.

Last time I mentioned that I was having problems with Nagios. Those problems are now gone. I tested it on my machine and it worked well but, in the end, I decided not to use it. There are two main reasons behind this. Firstly, Nagios is overkill for what we need. It’s too complex, and it would be too much to use it just to log our processes. Secondly, it doesn’t run on Windows. (Speaking of Windows, I have problems linking the OpenCL library. Sometimes it works, other times it doesn’t.)

The system I created for logging uses ini files that store data about each project run (one file for each run). It works in a similar way to the execution restore system. It uses the signals emitted by the QProcess and Executie classes. To create a nice interface with statistics and graphs I used QCustomPlot, a free library for plotting.

Another improvement to the project is the new execution model. Before, the execution order was created by sorting the workflow graph using DFS. All devices would run one task at a time, each device with a different input from the inputs folder(s). The new execution model can run multiple different tasks at the same time, if they are independent. It doesn’t use DFS for the topological sort. Instead, at each step, it removes the tasks that have 0 dependencies from the unsorted graph and adds them to the sorted execution order.
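To make the idea more concrete, here is a minimal sketch of that kind of step-by-step sort, using only the C++ standard library. It is not the code from WHC::IDE: the `Workflow` type and the `executionSteps` function are names I made up for this illustration, and the real IDE works on its own graph classes.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation (not the WHC::IDE one):
// task name -> names of the tasks it depends on.
using Workflow = std::map<std::string, std::vector<std::string>>;

// Returns the execution order as a list of steps. Every task in a step has
// 0 unmet dependencies, so the tasks of one step are independent of each other.
std::vector<std::vector<std::string>> executionSteps(const Workflow& workflow) {
    std::map<std::string, int> unmet;  // unmet dependency count per task
    std::map<std::string, std::vector<std::string>> dependents;

    for (const auto& [task, deps] : workflow) {
        unmet[task] += 0;  // make sure tasks without dependencies are present
        for (const auto& dep : deps) {
            if (workflow.count(dep) == 0)
                throw std::invalid_argument("unknown dependency: " + dep);
            ++unmet[task];
            dependents[dep].push_back(task);
        }
    }

    // First step: every task that starts with 0 dependencies.
    std::vector<std::string> ready;
    for (const auto& [task, count] : unmet)
        if (count == 0) ready.push_back(task);

    std::vector<std::vector<std::string>> steps;
    std::size_t sorted = 0;
    while (!ready.empty()) {
        steps.push_back(ready);
        sorted += ready.size();

        // "Remove" the ready tasks from the graph: each task that depended
        // on them has one dependency less, and may become ready itself.
        std::vector<std::string> next;
        for (const auto& done : ready)
            for (const auto& task : dependents[done])
                if (--unmet[task] == 0) next.push_back(task);
        ready = std::move(next);
    }

    // Tasks that never reached 0 dependencies are part of a cycle.
    if (sorted != workflow.size())
        throw std::runtime_error("the workflow contains a cycle");
    return steps;
}
```

For a workflow where C depends on A and B, and D depends on C, this gives three steps: A and B together, then C, then D. Grouping the order into steps (instead of one flat list) is what makes it easy to see which tasks could be handed to different devices at the same time.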

My next goal is the editor. I will talk about it next time.

WHC::IDE #2 – execution restore and logging

Things have been working quite well with the project, with a few exceptions (some bugs). I managed to implement the execution restore. To do this, I started by logging the finished processes and adding them to a file. Restoring the execution consists of continuing from where the system crashed. While working on this I discovered a bug caused by old processes that were not properly removed from memory. This caused a segmentation fault in some rare cases because of signals emitted after the deletion of the process. The fix was simple: use deleteLater() to remove the instances, but it took me a long time to figure this out.

Moving on, I started working on the logging system. After a talk with Veaceslav Munteanu (my mentor) and Grigore Lupescu (Veaceslav’s mentor, from the previous RSoC) we decided that we should use an already existing logging system. Grigore suggested Nagios. The first step was preparing the IDE for a logging system. WHC::IDE had no way of telling how a process ended or, if it crashed, what caused the crash.

I added this functionality, and while I was at it, I saw a way to improve the execution restore. This is closely related to how tasks run in WHC::IDE, so I am going to briefly explain it.

When you click on the “Run” button, the IDE performs a topological sort that establishes the order in which to run the tasks. A task may have several processes associated with it: for every input file combination, there is a corresponding process. The output from that process could be used by several tasks, but it is a waste of resources to run it again for each task or folder that needs the output. Instead, the IDE runs the process only once and puts the output in a temporary folder. After that, it copies that folder or gives it to the other tasks that require it. There are a lot of IO operations involved, so this could go wrong in a lot of cases. The improvement I saw possible was the following: when such an IO operation fails, the IDE will still add the process to the list of processes that have run, only this time it will mark it as an “IOError process”. When the user wants to restore the execution, WHC::IDE will go back to the temporary folder and retry the IO operations, but it will not run the process again.
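The restore decision described above can be sketched roughly like this. This is only an illustration, not the actual WHC::IDE code: `ProcessState`, `LoggedProcess`, `RestoreAction` and `actionFor` are hypothetical names, and I assume each process can be identified by a string id.

```cpp
#include <string>
#include <vector>

// Hypothetical states stored in the list of processes that have run.
enum class ProcessState { Finished, IOError };

// One entry of that list; the string id is an assumption of this sketch.
struct LoggedProcess {
    std::string id;
    ProcessState state;
};

// What the restore should do with a given process.
enum class RestoreAction { Skip, RetryIO, Run };

RestoreAction actionFor(const std::string& id,
                        const std::vector<LoggedProcess>& log) {
    for (const auto& entry : log) {
        if (entry.id != id) continue;
        // The process itself already ran. If only the IO part failed,
        // retry the IO on its temporary output instead of running it again.
        return entry.state == ProcessState::IOError ? RestoreAction::RetryIO
                                                    : RestoreAction::Skip;
    }
    // Not in the log: it never ran to completion, so it has to run.
    return RestoreAction::Run;
}
```

The point of the extra state is that a failed copy no longer costs a full re-run of the process: only processes missing from the log are executed again.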

After completing this, and working with the execution class, I saw yet another way I could improve the project. I am talking about the execution speed on machines with multiple devices. WHC::IDE can run the same task in parallel, each device running a process with different inputs. But what happens when you have many tasks with one input that can run in parallel? Well, they could all run on different devices with a small adjustment in the running algorithm. The problem with this “small adjustment” is that it requires A LOT of code refactoring. I started trying different approaches, but they are not ready to be committed to the project because they break some things.

Going back to the logging system (sorry for getting so distracted), I currently have some problems with getting Nagios to run on my machine. It also doesn’t run on Windows, and one of the project’s goals is to be cross-platform. I am starting to believe that the way to go is to write something new and lightweight for our project.

Next time I will tell you more about the logging system (I will get it working by then) and also about the editor :)

WHC::IDE #1 – porting to Qt5 and memory leaks

WHC::IDE is an IDE for parallel and distributed projects using OpenCL. My goals in the last two weeks have been porting from Qt 4.8 to Qt5 and fixing memory leaks.

I do not understand the reason for the changes that made old projects incompatible, but it wasn’t for me to decide. All I could do was to find a way to make it work again. To be honest, it wasn’t that difficult, considering there are many helpful blog posts and articles floating around the internet. I don’t have much to say about this, but if you are interested in porting old projects, the most useful resource I found was this.

After porting, the real work began. I started this project knowing that it had a huge, black-hole-inducing amount of memory leaks, so I was prepared for the worst. And what I saw seemed bad. After every run, valgrind would leave a 5 MB log (I redirected standard error to a file). The thing that concerned me the most was that I didn’t understand anything from the 5 MB file. The stack trace was too short and I couldn’t see which methods from my code caused the mess. After a quick search, I found the --num-callers parameter that sets the size of the stack trace. Once more, I ran the program with valgrind, this time with a much bigger stack trace limit, and started examining the log file. It showed that almost every error was caused by Qt methods. I went to Google with those errors and I learned that Qt causes a lot of false positives in valgrind and that there is a way to create a suppression file.

A suppression file tells valgrind which of the errors it encounters should not be reported. The good news is that the Qt Creator IDE (the program I use to develop the project) already has valgrind set up to suppress the false positives. Running from Qt Creator showed that WHC::IDE is in much better shape than I thought. I spent the rest of the time finding real memory leaks and segmentation faults.

Next time I will talk about restoring the running state of a project, in case of a system crash. I strongly believe I’m going to get this working by next week.