[The Library Dev Blog] Flat Text Files...

odinthelibrarian (54) in #dev • 8 years ago

Have the newest update to Nodez up on my GitHub!

Introduction to Nodez Analysis Software

Spoiler alert: it's basically nothing.

Most of the work has been in the design of the software. I've run into really noobish walls when it comes to file access issues, but I'll get into that later.

The Design

One of the biggest design decisions I made was going with a flat text file schema instead of a database (MongoDB being my initial choice). My reason for going with a clunkier, more resource- and code-intensive solution like flat text files for Nodez is simple: I want to keep the software stack small. This tool isn't designed for people who know about databases, or people who can handle a larger software stack. I don't want a ton of dependencies, and I don't want the end user to have to deal with technical headaches. My goal is to make simple, easy-to-use analysis software built for a borderline tech-illiterate end user who has analysis work to do. Think a small-town detective or a private-practice PI.

So basically, all of the information will be stored in an accessible and hopefully readable flat text file. Yes, this means processing time may increase on bigger projects with more nodes and connections. Yes, this may mean bigger files on larger projects. I think in the end this will lead to a more user-friendly installation and experience, as long as the Journalist is using a laptop with more than a 15 GB hard drive and more than 2 MB of RAM... which is always possible.

Basically, the Node is the base form of any kind of data you want to represent. Take, for example, a Chinese Threat Actor XYZ and an American Tech Firm ABC. XYZ and ABC are both nodes. An attack is a possible Connection between the two, denoted CON. ABC and XYZ each have a name, i.e. the name of the threat actor XYZ and the name of the company ABC. With a flat text file design, each node will be stored by name, followed by its description and every node it's connected to. So it would look something like this (design pending):
#XYZ (name)

Threat actor (description)

~ABC (connection)

#ABC

American tech company

~XYZ
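
To make that layout concrete, here is a minimal sketch of how a file like this could be read back into memory. It is not the code from the Nodez repo: the names Node and parse_nodes are made up for illustration, and it assumes the "(name)", "(description)" and "(connection)" labels above are explanations rather than part of the file. A line starting with # opens a node, a line starting with ~ adds a connection, and any other non-blank line is treated as description text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory form of one node from the flat file."""
    name: str
    description: str = ""
    connections: list[str] = field(default_factory=list)


def parse_nodes(text: str) -> dict[str, Node]:
    """Parse the '#name / description / ~connection' layout into Node objects."""
    nodes: dict[str, Node] = {}
    current: Node | None = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are only visual spacing
        if line.startswith("#"):
            # A '#' line starts a new node record.
            current = Node(name=line[1:].strip())
            nodes[current.name] = current
        elif current is None:
            continue  # ignore anything before the first '#' header
        elif line.startswith("~"):
            # A '~' line names a node this one is connected to.
            current.connections.append(line[1:].strip())
        else:
            # Anything else is description text (joined if it spans lines).
            current.description = (current.description + " " + line).strip()
    return nodes


sample = """\
#XYZ

Threat actor

~ABC

#ABC

American tech company

~XYZ
"""

nodes = parse_nodes(sample)
print(nodes["XYZ"].description)   # Threat actor
print(nodes["XYZ"].connections)   # ['ABC']
```

Since every connection is written under both nodes, a reader like this can look up either side without scanning the whole file twice. The cost is that both entries have to be kept in sync on every write.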

I did some very rough sketches of possible User Interface (UI) designs. They're too ugly to post here, but they capture how I want it to look in the end.

I Hate IO

As stated before, interacting with flat text files can be very code-intensive, which is tech-speak for annoying as HELL. Logical race conditions seem to be my biggest problem: with the current design I am reading from and writing to the same text file at the same time. These aren't like the race conditions that come with multi-threading. They are purely logical, meaning the problem is in my design and not necessarily the technical implementation. Luckily, this means I should be able to remediate these issues ASAP.
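
One common way around this kind of problem is to never read and write the same file at once: read the whole file, change the lines in memory, write the result to a temporary file, then swap it into place. The sketch below shows that pattern under stated assumptions. It is not the fix used in Nodez, and update_file and add_connection are hypothetical helper names. It only uses standard library calls (tempfile.mkstemp, os.fdopen, os.replace).

```python
import os
import tempfile


def update_file(path, transform):
    """Read every line, transform the list in memory, then replace the file."""
    # Phase 1: read everything and close the file before any writing starts.
    with open(path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()

    # Phase 2: all edits happen on the in-memory list, not on the file.
    new_lines = transform(lines)

    # Phase 3: write to a temp file in the same directory, then swap it in.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write("\n".join(new_lines) + "\n")
        os.replace(tmp_path, path)  # the original stays intact until this call
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # clean up the partial temp file
        raise


def add_connection(lines, node, target):
    """Return a copy of lines with '~target' appended to node's record."""
    out = []
    in_node = False
    added = False
    for line in lines:
        if line.startswith("#"):
            # A new record begins; close out the previous one if it was ours.
            if in_node and not added:
                out.append("~" + target)
                added = True
            in_node = line[1:].strip() == node
        out.append(line)
    if in_node and not added:
        out.append("~" + target)  # node was the last record in the file
    return out


# Example use (assumes a file named nodes.txt already exists):
# update_file("nodes.txt", lambda lines: add_connection(lines, "XYZ", "DEF"))
```

The trade-off is that the whole file gets rewritten on every change, which is slower on big projects. For the small files this tool is aimed at, that is probably an acceptable price for never leaving a half-written file behind.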

Take a Look!

The code is on my GitHub. Check it out, follow along, comment with issues or questions, and read over what I've got. It's all in Python, so it's pretty damn readable.

--------------------------------------

Like the post? I run this threat intelligence blog on Steemit and offer the content free of charge. If you're a Steemit user, you know that upvoting, which you do for free, magically puts a couple cents in my pocket. Maybe I'll buy a pack of gum with last week's earnings, but it all depends on your help. Not a Steemit user? My biggest metric of success is my viewership. If I don't make a cent but my content reaches a wide audience, that means my product is valuable and my efforts are worthwhile. Therefore, give me a share on your social media of choice, follow me on Steemit for more threat intel posts, and follow me on Twitter to see stupid memes and get updates when I post.

#steemit #osint #thelibrary #nodez