#3 DexOnline – Romanian Literature Crawler

Hi,

This week I learned how to use Idiorm, a PHP library for MySQL databases, and I implemented the database side of the crawler. Idiorm's INSERT usage seemed rather obscure to me: I couldn't find an example on the web, so I started reading the library's source. It turns out you have to call $obj = ORM::for_table('table_name')->create(); to get an object that exposes the table fields as PHP properties, then set the corresponding values ($obj->field_1 = $val_1; ... $obj->field_n = $val_n;), and finally call $obj->save(); I wrote this code because the other DexOnline intern will need it.
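A minimal sketch of the create / set / save sequence described above. The table name, the column names, the path to idiorm.php and the in-memory SQLite database are all assumptions made so the example runs on its own; the real crawler schema is not shown in this post.

```php
<?php
// Assumption: idiorm.php sits next to this script; adjust the path as needed.
require_once __DIR__ . '/idiorm.php';

// Assumption: an in-memory SQLite database stands in for the real MySQL one.
ORM::configure('sqlite::memory:');

// Hypothetical table, created here only so the example is self-contained.
ORM::get_db()->exec(
    'CREATE TABLE crawled_url (id INTEGER PRIMARY KEY, url TEXT, http_status INTEGER)'
);

// create() returns an empty object representing a row that does not exist yet.
$row = ORM::for_table('crawled_url')->create();

// Each column is set as a plain property.
$row->url = 'http://example.com/';
$row->http_status = 200;

// save() issues the INSERT because the object was made with create().
$row->save();

// After save(), id() holds the primary key of the new row.
echo $row->id(), "\n";
```

Idiorm builds the INSERT as a prepared statement, so the values are bound as parameters rather than concatenated into the SQL string.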

I also wrote a mechanism to manipulate URLs: it transforms relative URLs into canonical ones and checks whether a URL has already been used (hash + special cases).
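Roughly, the idea can be sketched like this. It is not the crawler's actual code: the function names are hypothetical, the "special cases" are not covered, and the resolution is a simplified version of the RFC 3986 rules that assumes well-formed http(s) URLs.

```php
<?php
// Normalise an absolute URL: lowercase scheme and host, collapse "." and ".."
// path segments, drop the fragment. Hypothetical helper, simplified on purpose.
function canonicalize_url(string $url): string
{
    $p = parse_url($url);
    if ($p === false || !isset($p['scheme'], $p['host'])) {
        throw new InvalidArgumentException("Not an absolute URL: $url");
    }
    $path = $p['path'] ?? '/';
    $segments = [];
    foreach (explode('/', $path) as $seg) {
        if ($seg === '..') {
            array_pop($segments);          // go one directory up
        } elseif ($seg !== '.' && $seg !== '') {
            $segments[] = $seg;            // keep real segments only
        }
    }
    $clean = '/' . implode('/', $segments);
    // Keep a trailing slash when the original path pointed at a directory.
    if ($segments && preg_match('#/(\.\.?)?$#', $path)) {
        $clean .= '/';
    }
    $port  = isset($p['port']) ? ':' . $p['port'] : '';
    $query = isset($p['query']) ? '?' . $p['query'] : '';
    return strtolower($p['scheme']) . '://' . strtolower($p['host']) . $port . $clean . $query;
}

// Resolve a link found on the page $base into a canonical absolute URL.
function resolve_url(string $base, string $link): string
{
    $link = preg_replace('/#.*$/', '', trim($link));   // fragments never reach the server
    if ($link === '') {
        return canonicalize_url($base);
    }
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $link)) { // already has a scheme
        return canonicalize_url($link);
    }
    $b = parse_url($base);
    $origin = $b['scheme'] . '://' . $b['host'] . (isset($b['port']) ? ':' . $b['port'] : '');
    $basePath = $b['path'] ?? '/';
    if (strpos($link, '//') === 0) {                   // protocol-relative
        return canonicalize_url($b['scheme'] . ':' . $link);
    }
    if ($link[0] === '/') {                            // root-relative
        return canonicalize_url($origin . $link);
    }
    if ($link[0] === '?') {                            // same page, new query
        return canonicalize_url($origin . $basePath . $link);
    }
    // Path-relative: replace everything after the last "/" of the base path.
    $dir = substr($basePath, 0, strrpos($basePath, '/') + 1);
    return canonicalize_url($origin . $dir . $link);
}

// Remember a URL by the hash of its canonical form; returns true the first time only.
function mark_seen(array &$seen, string $url): bool
{
    $key = md5(canonicalize_url($url));
    if (isset($seen[$key])) {
        return false;
    }
    $seen[$key] = true;
    return true;
}

$seen  = [];
$url   = resolve_url('http://Example.com/carti/index.html', '../autori/./eminescu.html#bio');
$isNew = mark_seen($seen, $url);
```

Hashing the canonical form instead of the raw string means two spellings of the same address (different case in the host, extra "./" segments, a fragment) count as one visited page.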

I got stuck saving the rawPage and parsedText to the filesystem because of directory permissions. I didn't want to change the directory owner, so I moved the files to /tmp/DexContent/, but the files still aren't saved. I'm using file_put_contents($filename, $string), and $filename contains only alphanumeric characters and the '_' char.
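I don't know yet what the cause is, but two usual suspects are that file_put_contents() does not create missing directories and that the user PHP runs as may not have write access. A small sketch that checks both and reports the actual error instead of failing silently; the helper name, the 0775 mode and the file name are assumptions for illustration.

```php
<?php
// Hypothetical helper: save crawler output and fail loudly when something is wrong.
function save_content(string $dir, string $filename, string $content): string
{
    // file_put_contents() will not create the directory, so do it here if needed.
    if (!is_dir($dir) && !@mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }
    // The check is made for the user PHP runs as, which may differ from the shell user.
    if (!is_writable($dir)) {
        throw new RuntimeException("Directory is not writable by the PHP user: $dir");
    }
    $path = rtrim($dir, '/') . '/' . $filename;
    // file_put_contents() returns the number of bytes written, or false on failure.
    if (@file_put_contents($path, $content) === false) {
        $err = error_get_last();
        throw new RuntimeException(
            "Write failed for $path: " . ($err['message'] ?? 'unknown error')
        );
    }
    return $path;
}

// Example call; the system temp directory stands in for /tmp here.
$path = save_content(sys_get_temp_dir() . '/DexContent', 'raw_page_1', '<html>...</html>');
```

The exception message should at least tell whether the problem is the directory itself or the write.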

2 thoughts on “#3 DexOnline – Romanian Literature Crawler”

  1. Thanks a lot! Last evening I managed to bind elFinder user actions to table modifications, but did that using PDO objects, because that was the only way I knew. Can’t wait to try what you have taught me :D .

    I still have one question: if I want to insert multiple rows, do I have to make 'field_n' an array, or do I have to change its value each time before calling the save() method?

    • You're welcome, and I'm so sorry I did not see your question, my bad. I assume you answered it yourself a long time ago. If you want to insert multiple rows with a single call, you can't. You can use raw_query($query) for that, but this means you'll need to build the query yourself and check your parameters against SQL injection, because you don't want somebody to delete the DEX database through a tiny piece of user input (the library did that protection for free ;) ).
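For a hand-built multi-row INSERT, one way to keep the injection protection is to bind every value as a parameter of a prepared statement. The sketch below uses plain PDO instead of raw_query() (Idiorm exposes its own PDO connection through ORM::get_db()); the table, the columns and the in-memory SQLite database are assumptions made so the example runs on its own.

```php
<?php
// Assumption: in-memory SQLite replaces the real database for this sketch.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table with two data columns.
$db->exec('CREATE TABLE word (id INTEGER PRIMARY KEY, form TEXT, frequency INTEGER)');

// Rows to insert; the last one mimics hostile user input.
$rows = [
    ['luceafar', 12],
    ['codru', 7],
    ["x'); DROP TABLE word; --", 1],
];

// Build one "(?, ?)" group per row: only placeholders go into the SQL text.
$placeholders = implode(', ', array_fill(0, count($rows), '(?, ?)'));
$stmt = $db->prepare("INSERT INTO word (form, frequency) VALUES $placeholders");

// Flatten the rows into one list of values, in placeholder order, and bind them.
$stmt->execute(array_merge(...$rows));

$count = (int) $db->query('SELECT COUNT(*) FROM word')->fetchColumn();
```

Because the values never become part of the SQL string, the hostile row is stored as ordinary text and the table survives.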
