Friday, January 30, 2009

Classification of languages

As far as I know, a language is either designed for a purpose or evolves naturally. If people design a language for a task and give it a consistent set of rules, we call it a formal language. If a language is acquired naturally, evolves continuously, and has some ambiguous or context-dependent rules, we call it a natural language.

Formal languages classification

Languages are generated by grammar rules. Noam Chomsky proposed a hierarchy of formal grammars, and each level corresponds to a class of automata.

  • Type 0: Unrestricted grammars - Turing machine

  • Type 1: Context-sensitive grammars - linear bounded automaton

  • Type 2: Context-free grammars - pushdown automaton

  • Type 3: Regular grammars - finite state automaton



(See Wikipedia)
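To make the gap between Type 3 and Type 2 concrete, here is a minimal Python sketch. The function names are my own, chosen only for illustration. The first recognizer needs only a fixed number of states, so it behaves like a finite state automaton. The second one has to count, so it uses a stack, like a pushdown automaton.

```python
def accepts_a_star_b_star(text: str) -> bool:
    """Finite-state recognizer for the regular language a*b*."""
    state = "A"  # "A": still reading a's, "B": already reading b's
    for ch in text:
        if state == "A" and ch == "a":
            continue
        if ch == "b":
            state = "B"
            continue
        return False  # an 'a' after a 'b', or an unknown symbol
    return True


def accepts_an_bn(text: str) -> bool:
    """Pushdown-style recognizer for the context-free language a^n b^n (n >= 0)."""
    stack = []
    seen_b = False
    for ch in text:
        if ch == "a" and not seen_b:
            stack.append("a")  # remember one pending 'a'
        elif ch == "b" and stack:
            seen_b = True
            stack.pop()  # match this 'b' against a pending 'a'
        else:
            return False
    return not stack  # every 'a' must have been matched
```

The string "aab" is accepted by the first function but rejected by the second, because no finite set of states can check that the number of a's equals the number of b's.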


Natural languages


As far as I know, natural languages don't fit neatly into the Chomsky hierarchy. But I believe that any automaton able to handle a natural language would have to be as powerful as a Turing machine. On top of that, there is a lot of semantic ambiguity that must be resolved from the surrounding context or from previously learned knowledge (more info).

Remaining questions

  • Is there a theory that explains the interaction between natural and formal languages?

  • Esperanto is defined as a constructed language. Does that mean the same as formal language?

Tuesday, December 16, 2008

Parsing context-free grammars with nltk_lite

Before writing about grammar transformers, I was looking for a way to parse regular and context-free grammars. Strings generated by a regular grammar are very easy to recognize in modern programming languages with regular expressions.
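As a small illustration, here is a sketch using Python's standard re module. The grammar and the in_language helper are made up for this example: the right-linear grammar S -> a S | b T, T -> b T | (empty) generates zero or more a's followed by one or more b's, which is exactly the regular expression a*b+.

```python
import re

# Illustrative right-linear grammar:  S -> a S | b T,   T -> b T | (empty)
# It generates zero or more a's followed by one or more b's.
PATTERN = re.compile(r"a*b+")


def in_language(text: str) -> bool:
    """Return True if the whole string belongs to the language a*b+."""
    return PATTERN.fullmatch(text) is not None
```

Note the use of fullmatch: with match or search the pattern would also accept strings that only contain a valid prefix or substring.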

However, context-free grammars (CFGs) aren't that easy to parse, so we need a library. Surfing the web, I found python for linguist, which is a tutorial about Python and the nltk library.

Following the tutorial's advice, we only need 3 steps:

  1. Create the non-terminal instances

  2. Make the individual productions

  3. Make the grammar object

With the grammar object, you can parse sentences with several parsing schemes.
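The three steps above can be sketched with the standard library only. This is not the nltk_lite API: the Nonterminal, Production and Grammar classes below are hypothetical stand-ins that mirror the same idea, and the toy English grammar is invented for the example. The recognizer is a naive top-down one, so it assumes the grammar has no left recursion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nonterminal:
    symbol: str


@dataclass(frozen=True)
class Production:
    lhs: Nonterminal
    rhs: tuple  # mix of Nonterminal objects and terminal strings


class Grammar:
    def __init__(self, start, productions):
        self.start = start
        self.productions = list(productions)

    def recognizes(self, tokens):
        """Naive top-down recognizer; assumes no left-recursive productions."""
        tokens = tuple(tokens)
        return len(tokens) in self._ends(self.start, tokens, 0)

    def _ends(self, symbol, tokens, pos):
        """Return every position where `symbol` can end when it starts at `pos`."""
        if isinstance(symbol, str):  # terminal: must equal the next token
            if pos < len(tokens) and tokens[pos] == symbol:
                return {pos + 1}
            return set()
        ends = set()
        for prod in self.productions:
            if prod.lhs != symbol:
                continue
            positions = {pos}
            for part in prod.rhs:  # thread positions through the right-hand side
                positions = {e for p in positions
                             for e in self._ends(part, tokens, p)}
            ends |= positions
        return ends


# Step 1: create the non-terminal instances
S, NP, VP, Det, N, V = [Nonterminal(s) for s in "S NP VP Det N V".split()]

# Step 2: make the individual productions
productions = [
    Production(S, (NP, VP)),
    Production(NP, (Det, N)),
    Production(VP, (V, NP)),
    Production(Det, ("the",)),
    Production(N, ("dog",)),
    Production(N, ("cat",)),
    Production(V, ("chased",)),
]

# Step 3: make the grammar object
grammar = Grammar(S, productions)
```

With this grammar object, grammar.recognizes("the dog chased the cat".split()) answers whether the sentence can be derived from S. A real library adds what this sketch leaves out: parse trees and more efficient parsing schemes such as chart parsing.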

Well, I hope regular and context-free grammars are powerful enough for gtol's initial purposes.

Saturday, May 24, 2008

Adelin Backup

In my last post, I said a few words about backup software and BoxBackup. This time I'm going to talk about Adelin Backup.

Adelin Backup is a GUI for BoxBackup which I'm currently developing for my company, Adelin Software (web coming soon). Like other backup GUIs, it has the usual feature set:
  • Let the user configure data paths
  • Backup method selection and parameter configuration
  • Restore data!
  • See what's happening
In order to achieve those features, we should be aware of boxbackup's communication methods:
  • To restore files, we need to talk directly with the server. Therefore, encrypted sockets are required.
  • To modify the configuration, we need to write to the config file.
  • To send commands to the boxbackup client, we have two methods: a local socket (Unix socket or named pipe), or system tools (kill, ps).
  • To receive status information, we also have two methods: a local socket and log reading.
Adelin Backup has two components:
  • Controller application: Checks boxbackup execution status
  • User Interface: Restores files and checks the configuration. It also communicates with boxbackup.
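As a rough idea of what checking the execution status with system tools can look like, here is a minimal Python sketch for POSIX systems. It is not Adelin Backup's code (which is Qt4 based) and it does not use any boxbackup API. The function names are hypothetical, and the assumption that the daemon writes its PID to a file is mine.

```python
import os


def read_pid(pid_file: str):
    """Read a PID from a file; return None if the file is missing or malformed."""
    try:
        with open(pid_file) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def is_process_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence and permission checks
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

This is the programmatic equivalent of combining ps and kill. It only says that some process has that PID, so a stale PID file can still give a false positive.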
I will talk about those interfaces and components soon, so please keep reading.

Thursday, May 1, 2008

Developing a GUI for boxbackup

Do you need to back up your data? There are a lot of open source backup applications, based on different approaches:
  • Tape or Disk: Tape-based backup tools handle large data sets. They copy all the data each time. Disk-based tools tend to use hard links and other filesystem features to reduce data transfer and backup size.
  • Remote or Local: Local backups are done inside an intranet. Avoiding the internet means a faster connection and fewer security concerns. However, an external location is safer against theft or a flood.
  • Encrypted or plain: If you want to place your backups outside your intranet, you will need to take some measures to keep your data private. There are two encryption approaches: adding an encrypted layer to the connection, or encrypting on the client side.
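To show what the hard link trick used by disk-based tools looks like, here is a minimal Python sketch. It illustrates the general technique only, not how boxbackup stores its data. The make_snapshot and demo names are hypothetical, and the sketch assumes all snapshots live on the same filesystem, since hard links cannot cross filesystems.

```python
import filecmp
import os
import shutil
import tempfile
from typing import Optional


def make_snapshot(source: str, previous: Optional[str], target: str) -> None:
    """Copy `source` into `target`, hard-linking files unchanged since `previous`."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.join(target, rel), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target, rel, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)  # unchanged: reuse the data already on disk
            else:
                shutil.copy2(src, dst)  # new or modified: store a fresh copy


def demo() -> bool:
    """Back up twice; return True if the unchanged file is shared by both snapshots."""
    with tempfile.TemporaryDirectory() as base:
        source = os.path.join(base, "data")
        os.makedirs(source)
        with open(os.path.join(source, "notes.txt"), "w") as handle:
            handle.write("hello")
        first = os.path.join(base, "snap1")
        second = os.path.join(base, "snap2")
        make_snapshot(source, None, first)
        make_snapshot(source, first, second)
        return os.path.samefile(os.path.join(first, "notes.txt"),
                                os.path.join(second, "notes.txt"))
```

Each snapshot looks like a full copy of the data, but an unchanged file takes disk space only once.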
Boxbackup is an open source, disk-based backup application with client-side encryption and internet servers. It has some killer features, like sending incremental changes instead of the whole file. But, in my opinion, the best thing about boxbackup is that the source code is available. And that's important when a program handles your private data.

But boxbackup has some drawbacks. Regular users don't like a command-line interface. Even worse, if you lose your data, you will really appreciate a "panic button" to return to a safe state.

That is the main motivation for developing a GUI for boxbackup: letting users do their backups easily. In the next post, I will talk about Boxi, a wxWidgets GUI for boxbackup, and Adelin Backup, a Qt4-based interface that I'm developing for my company.

Sunday, April 27, 2008

Welcome to this place: an unusual Description

Hi everyone. I'm currently working in a C.S.-related job in Madrid.

Wow! One random person creates a random blog and says some random things. From a certain point of view, that's boring. In fact, it is.

But it's not my intention to annoy anyone! So I'm going to share the knowledge that I acquire at work, or maybe some ideas that could be interesting. I'm sure I will talk about UML, data visualization tools like graphviz, functional programming, OO and many other topics.

Welcome :)