Thursday 27 July 2023

Hyper-glossary nearing completion (?)

My next book will be a 'hyper-glossary' of terms relating to information security, including closely related aspects such as information risk management, governance, compliance ... and more ... and there's the rub: I'm struggling to catch up/keep up with developments in the field, not least because of the rate at which novel concepts are introduced and new terms are coined.

Here's an example of a definition originally added a couple of years ago and most recently amended today:

There I've defined "Deep fake", one of several terms washed up in the AI tsunami. The underlined terms are hyperlinked to their definitions ... and so on forming an extensive web within the document.

Researching and compiling/wordsmithing each entry takes me 'a while', then adding the anchors and hyperlinks to and from the new entry takes 'a while longer', especially when I get sidetracked while exploring other definitions - including other similar terms found online in various published glossaries. 

A new ISO glossary on AI opened a rich seam of semantics and prompted a shed-load of research and writing this week. I'm reluctant to even estimate the hours 'invested' (well, OK, spent or sunk). It's a good thing I enjoy this task!

You'll see from that example that I expand upon the implications, rather than merely defining things - for terms that catch my fertile imagination anyway. Reference sources such as security standards, methods and laws are more succinct/curt, to the point of being distinctly cryptic and unhelpful for the poor reader in some cases. It doesn't help that they are mostly the products of erudite committees, often international with heavy academic involvement. Precision and accuracy often trump readability.

The converse also applies sometimes when my definition is shorter than the 'official' reference. Here's the definition for the word "Deduplication", for instance, distinguishing my plain English version in black text from a more involved definitive version quoted from the standard in red italics:

I have consciously and deliberately referred to 'information' rather than 'data', since the principle of deduplication applies to information in all formats, not just digital data and computer systems. That hints at a concern with many of the references in this field: the IT context is implicit or explicit, particularly in the "cybersecurity" ones, most of which further constrain themselves to dealing with active attacks perpetrated through the Internet, completely ignoring vast swathes of the extensive threat landscape that lies before us. Notice the "Deep fake" example above mentions "systems" and "technology" but not specifically IT or data, while the long list of significant implications is mostly social/societal in nature: IT is merely a handy tool for deep fakery. It may not qualify as a cyber issue but it most definitely is pertinent to information security.

The book has over 500 pages and just short of 150,000 words, with an indeterminate number of hours of proofreading and correcting still to go, while trying not to get sidetracked by yet more new terms of art (maybe that should read "terms o' fart"!). Last week I made the executive decision on behalf of the reader to rationalise the numerous abbreviations more consistently, with entries expanding on the abbreviations hyperlinked to separate entries for the definitions. I just checked: there are 3,516 rows in the table today, meaning roughly 3,100 terms plus 400 abbreviations so far, although I keep tripping over more.

And that's another thing: Amazon's system refuses to convert the book from MS Word to Kindle format unless I split the giant table. It doesn't say how many splits I need, and I'm wary of breaking those internal hyperlinks that I've patiently introduced, wrecking the internal web that makes it so easy to browse. MS Word is hanging in there and can even generate PDF output, so now I'm thinking about how best to publish it as an electronic document. Maybe an actual book publisher will have the skills to make it work technically and commercially - which would be nice.
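For what it's worth, a check along these lines could catch broken links after any split. This is only a minimal sketch in Python, assuming the document is first saved from Word as a web page and that the bookmarks come out as `name` or `id` attributes with the links as `href="#..."`. The class and function names are made up for illustration, and nothing here reflects how Amazon's converter actually works.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect anchor names and internal link targets from an HTML export."""

    def __init__(self):
        super().__init__()
        self.anchors = set()  # ids/names that internal links can point to
        self.targets = []     # fragment targets of internal links

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Any element with an id can be a link destination.
        if attrs.get("id"):
            self.anchors.add(attrs["id"])
        # Assumption: bookmarks are exported as <a name="...">.
        if tag == "a" and attrs.get("name"):
            self.anchors.add(attrs["name"])
        # Internal links look like <a href="#some_anchor">.
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.targets.append(href[1:])


def dangling_links(html_text):
    """Return the sorted internal link targets that have no matching anchor."""
    audit = LinkAudit()
    audit.feed(html_text)
    return sorted(set(audit.targets) - audit.anchors)


if __name__ == "__main__":
    # Hypothetical two-table fragment standing in for a split glossary.
    sample = """
    <table><tr><td><a name="deep_fake"></a>Deep fake</td>
    <td>See <a href="#ai">AI</a> and <a href="#disinformation">disinformation</a>.</td></tr></table>
    <table><tr><td><a name="ai"></a>AI</td><td>...</td></tr></table>
    """
    print(dangling_links(sample))
```

Running the same check on the export before and after a split, then comparing the two lists, would show whether the split itself broke anything.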

Meanwhile, as a taster for what's to come, I have self-published an older, shorter version of the hyper-glossary as a simple HTML page without any text formatting except for the underlined hyperlinks, through our ISO27001security website. By all means compare my original (2021) online definition of "Deep fake" with the latest version above. Did I improve it? The jury's out!

The hyper-glossary will soon be 'done' - or rather, I'll have had more than enough of it, so I'll ship it and take a break before commencing the next round of updates. It's not unlike painting the Forth Bridge, worse in fact since they discovered modern long-life paint a decade ago. Lucky buggers.




   

  
