Posts Tagged ‘lexiio’

A more relational dictionary

As I started looking to add more functionality to Lexiio, I realized the Wiktionary definitions database dump I was using wasn’t going to cut it; specifically, I needed a normalized schema, or I’d have data duplication all over the place. I started normalizing in MySQL, but whether it was MySQL or MySQL Workbench, I kept running into character encoding issues. Using a simple INSERT-SELECT in MySQL 5.7 to transfer words from the existing table to a new table resulted in lost characters:

MySQL losing characters

I dumped the data into PostgreSQL, didn’t encounter the issue, and just kept working from there.

The normalized schema can be downloaded here: LexiioDB normalized
(released under the Creative Commons Attribution-ShareAlike License)

LexiioDB schema

The unknown_words and unknown_to_similar_words tables are specific to Lexiio: they store unknown words entered by the user, along with close matches among known words (found via the Levenshtein distance).
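
For a rough idea of how that similar-word matching could work, here is a minimal Go sketch of the Levenshtein distance and a lookup built on it. This is not Lexiio's actual code: the function names levenshtein and similarWords, the sample word list, and the distance threshold of 2 are all assumptions made for illustration. The comparison works on runes rather than bytes, so accented and other non-ASCII characters each count as a single edit.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshtein returns the number of single-rune insertions, deletions,
// and substitutions needed to turn a into b.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1) // distances for the previous row
	curr := make([]int, len(rb)+1) // distances for the row being filled
	for j := range prev {
		prev[j] = j // turning "" into rb[:j] takes j insertions
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i // turning ra[:i] into "" takes i deletions
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// similarWords (hypothetical helper) returns the known words within
// maxDist edits of the unknown word, in their original order.
func similarWords(unknown string, known []string, maxDist int) []string {
	var matches []string
	for _, w := range known {
		if levenshtein(unknown, w) <= maxDist {
			matches = append(matches, w)
		}
	}
	return matches
}

func main() {
	known := []string{"definition", "defined", "banana"} // sample data only
	fmt.Println(levenshtein("kitten", "sitting"))        // prints 3
	fmt.Println(similarWords("definiton", known, 2))     // prints [definition]
}
```

Scanning every known word like this is fine for a sketch; against a full dictionary, the candidate list would probably need narrowing first (for example, by word length) before computing distances.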

Lexiio

Another little experiment of mine: Lexiio, a web-based CLI dictionary.

Lexiio

A few takeaways:

  • Part of the reason for building this was that I wanted to actually make use of the Wiktionary data set snapshot in a real project. The data set is pretty comprehensive, and easy to parse and work with.
  • This was also a learning exercise for Golang. There’s nothing complex here but, so far, working with Go has been enjoyable. I like that I’m building a native application, types are enforced, and the HTTP server included as part of the standard library is incredibly easy to set up and work with.
  • I wanted to experiment a bit with what a web-based CLI would look and feel like. For something like a dictionary, where user interaction revolves around textual input/output, a command-line interface seems to work really well.
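
To give a sense of how little code the standard library's HTTP server needs, here is a minimal sketch of a plain-text lookup endpoint. It is not Lexiio's actual code: the /define route, the word query parameter, port 8080, and the in-memory definitions map (standing in for the real database) are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
)

// definitions is a hypothetical in-memory stand-in for the dictionary database.
var definitions = map[string]string{
	"lexicon": "the vocabulary of a language or subject",
}

// define normalizes the word and returns the text to show the user,
// plus whether a definition was found.
func define(word string) (string, bool) {
	word = strings.ToLower(strings.TrimSpace(word))
	def, ok := definitions[word]
	if !ok {
		return fmt.Sprintf("no definition found for %q", word), false
	}
	return fmt.Sprintf("%s: %s", word, def), true
}

// defineHandler answers requests such as /define?word=lexicon with plain
// text, which suits a command-line style front end.
func defineHandler(w http.ResponseWriter, r *http.Request) {
	text, ok := define(r.URL.Query().Get("word"))
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	if !ok {
		w.WriteHeader(http.StatusNotFound)
	}
	fmt.Fprintln(w, text)
}

func main() {
	http.HandleFunc("/define", defineHandler)
	// ListenAndServe blocks until the server fails, so log.Fatal reports the error.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

With the server running, a request to http://localhost:8080/define?word=lexicon returns the definition as a single line of text, and an unknown word returns a 404 with a short message.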