lanky Girl

Archive for the ‘DITA’ Category

3.11 Resources

In DITA on January 3, 2010 at 4:06 pm


JavaScript URL:


Macfarlane, A (2009). Lecture Notes, INM348 Digital Information Technologies and Architectures, Session 01 – Blog, page 1-3, City University, London. (accessed 8th October 2009) (accessed 14th October 2009)


Butterworth, B (2009). Lecture Notes, INM348 Digital Information Technologies and Architectures, Lecture 02: Digital Representation and Organisation, Meta-Data/Markup, City University, London. (accessed 15th October 2009)

(accessed 15th October 2009)

blog 3 (accessed 26th October 2009) (accessed 27th October 2009) (accessed 26th October 2009)

(accessed 27th October 2009)

blog 4

Butterworth, B (2009). Lecture Notes, INM348 Digital Information Technologies and Architectures, Session 04 – Graphical Information, City University, London. (accessed 7th December 2009) (accessed 7th December 2009) (accessed 7th December 2009)

blog 5 (6th November 2009) (9th November 2009) (9th November 2009)

blog6 (accessed 14th November 2009) (accessed 14th November 2009) (accessed 15th November 2009)


Codd, E.F. (1970). “A Relational Model of Data for Large Shared Data Banks”. In: Communications of the ACM 13 (6): 377–387. (accessed 11th December 2009) (accessed 11th December 2009)

blog 8

Manning, C.D., Raghavan, P. and Schütze, H. (2009). An Introduction to Information Retrieval (online edition), Cambridge UP. (accessed 21st December 2009) (accessed 21st December 2009) (accessed 22nd December 2009)

blog 9 (accessed 27th December 2009) (accessed 29th December 2009) (accessed 30th December 2009)

blog 10 (accessed 1st December 2009)

Rosenfeld, L. and Morville, P. (2007). Information Architecture for the World Wide Web (3rd Edition), Sebastopol, CA: O’Reilly, page 4


3.10 Information Architectures

In DITA on January 3, 2010 at 3:56 pm

According to Morville and Rosenfeld (2007, p. 4), information architecture is a combination of many things:

“The combination of organization, labelling and navigation schemes within information systems”.

“The Structural design of an information space to facilitate task completion and intuitive access to content”.

“The art and science of structuring and classifying websites and intranets to help people find and manage information”.

“An emerging discipline and community of practice focused on bringing principles of design and architecture to the digital landscape”.

Information architecture, in the context of electronic publishing, relates to the way that information is organised in digital media.

Our coursework is an example of good information architecture. Content is user-generated and posts are organised chronologically. A tagging system lets you link posts and blogs on the same topic together. Widgets can be installed to make the pages more dynamic. There are now millions of blogs on the web covering a wide range of topics, and this explosion is down to how easy they are to use.

Information architecture also affects online journalism. News content often needs to be published quickly. Most online media have implemented a content management system that provides a spatial view at the front end and a database at the back end. SQL queries are used to retrieve information, which can easily be shared between departments.

HTML is used to structure the news stories. It also allows content creators to link to previous stories, which helps the user navigate the site. In terms of design, CSS allows the style to be separated from the content, which makes the content more accessible to blind people.

Images are important in electronic publishing to enhance content and make pages more dynamic. They also need to be in the right format in order for pages to render quickly.

JavaScript can add functionality that makes the website easier to navigate, and it can be combined with technologies that manipulate the XML markup.

3.9 Applications Development

In DITA on January 2, 2010 at 4:29 pm

This week has been challenging. We have been looking at client-side applications and learning how to program a supposedly simple JavaScript web page. So what is JavaScript? JavaScript is a scripting language used to make web pages dynamic. It was designed to look a bit like Java but is completely different, and is supposed to be easier for people who aren’t programmers to use. My problem with JavaScript is that at first it felt a bit like doing maths, and with anything to do with maths my brain shuts down. However, I found a website that put it in a simple way.

“Think of a programming language as similar to human languages in that they both have rules and syntax”.

1) The words are like a set of terms that refer to what your program should work with, i.e. your browser window, or how it can be manipulated, e.g. opening the browser window.

2) The way that the words can be put together to produce a desired effect is known as the language’s syntax.

I found the jump from the simple lecture notes to writing a program to be very steep. In order to learn more about the different objects I turned to W3Schools, then I used the Try It exercises to practise.

The exercise that we were set was to produce a JavaScript program to elicit information from a user and provide a web link to the appropriate section of the BBC web site according to the following criteria:

  • whether the user is interested in news or sport
  • whether ‘news users’ are in England, Northern Ireland, Scotland or Wales
  • whether ‘sport users’ are interested in cycling, golf, football or tennis.
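
The criteria above boil down to a small decision table. A minimal sketch of that logic follows, written in Python rather than the JavaScript used for the coursework. The function name choose_link and the example.com base address are hypothetical placeholders, not the real BBC addresses or my submitted program.

```python
# Hypothetical placeholder address: the real BBC section URLs are not given here.
BASE = "http://example.com/bbc"

NEWS_REGIONS = {"england", "northern ireland", "scotland", "wales"}
SPORTS = {"cycling", "golf", "football", "tennis"}


def choose_link(interest, choice):
    """Return a link for the user's answers, or None if they do not match."""
    interest = interest.strip().lower()
    choice = choice.strip().lower()
    if interest == "news" and choice in NEWS_REGIONS:
        # Replace spaces so that "northern ireland" forms a single path segment.
        return f"{BASE}/news/{choice.replace(' ', '_')}"
    if interest == "sport" and choice in SPORTS:
        return f"{BASE}/sport/{choice}"
    return None  # unknown interest, or a choice that does not belong to it


print(choose_link("news", "Northern Ireland"))
print(choose_link("sport", "golf"))
print(choose_link("sport", "wales"))
```

Returning None for a mismatched answer (for example a sport user who types a country) leaves it to the page to decide how to ask again.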

Here is a link to my JavaScript program

3.8 Information Retrieval

In DITA on January 1, 2010 at 6:40 pm

This week in DITA we have been introduced to the concept of information retrieval. Information retrieval is a broad concept, but essentially it deals with how we access information, usually in text format, from large collections stored on computers. Information retrieval is the opposite of the relational model in that it deals with unstructured data.

The web has meant that there has been an explosion in the amount of information we now have stored (Manning, Raghavan and Schütze, 2009).

During the lab exercise the first task was to do a Boolean search based on an emotional or ASK need. To do a Boolean search you pose a query in the form of a Boolean expression using the operators ‘AND’, ‘OR’ and ‘NOT’. To test this we used the Bing search engine and tried different search expressions to see how they changed the results. I used the ASK “Places for afternoon tea outside London”.

First search: places for afternoon tea NOT London – the results that came back did not feature London.

Second search: places for afternoon AND tea NOT London – the results that came back featured tea in the afternoon and were not in London.
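
To see how the operators combine, here is a minimal Python sketch that treats AND as set intersection and NOT as set difference. The four one-line "documents" are made up for illustration and the helper name matching is hypothetical. A real search engine such as Bing works on a far larger scale and in ways not covered here.

```python
# Made-up mini collection, keyed by document number.
docs = {
    1: "afternoon tea at a hotel in London",
    2: "afternoon tea in Bath",
    3: "coffee shops in Brighton for the afternoon",
    4: "cream tea in Devon",
}


def matching(term):
    """Return the set of document numbers whose text contains the term."""
    term = term.lower()
    return {number for number, text in docs.items()
            if term in text.lower().split()}


# afternoon AND tea NOT London
result = (matching("afternoon") & matching("tea")) - matching("London")
print(result)  # {2}
```

Only document 2 mentions both words and leaves out London, which mirrors what the second search was trying to do.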

The second half of the task was to create an inverted file.

Inverted files are used by search engines to speed up the process of retrieving information. An inverted file holds a list of terms (the lexicon) and, for each term, a record of how many times the term occurs and the positions at which it appears in the text.
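
A small inverted file can be built in a few lines of Python. This is only a sketch under simple assumptions: the sample sentence is made up, words are split on spaces, and positions are counted in words starting from 1. The function name build_inverted_file is hypothetical.

```python
def build_inverted_file(text):
    """Map each term to the list of word positions where it occurs."""
    index = {}
    for position, word in enumerate(text.lower().split(), start=1):
        index.setdefault(word, []).append(position)
    return index


text = "tea in the afternoon and tea in the garden"
index = build_inverted_file(text)

# Print each term with its number of occurrences and its positions.
for term in sorted(index):
    print(term, len(index[term]), index[term])
```

Looking up a term is then a single dictionary access, so the text itself does not have to be scanned again for every query.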

Here is a link to a worked example.

3.7 Databases

In DITA on December 30, 2009 at 9:19 pm

This week in DITA we have been learning about databases and the database management system (DBMS). The DBMS is used to access structured data efficiently. Previously, access to data tended to be quite slow as data was stored in separate files on magnetic strips. The files were stored in inconsistent formats and, because programs depended on the format of their data (program/data dependence), any small change in the data meant the programs had to change.


The introduction of the DBMS into companies allowed information to be centralised and made it easier to share and access amongst departments. Using a DBMS meant that the data was more secure, as access was controlled by a database administrator.

Relational model

The purpose of the relational model is to provide a declarative method for specifying data and queries: we directly state what information the database contains and what information we want from it, and let the database management system software take care of describing data structures for storing the data and retrieval procedures for getting queries answered (Codd, 1970).

Below is a worked example:

Sample Table: empinfo
first      last      id     age  city        state
John       Jones     99980  45   Payson      Arizona
Mary       Jones     99982  25   Payson      Arizona
Eric       Edwards   88232  32   San Diego   California
Mary Ann   Edwards   88233  32   Phoenix     Arizona
Ginger     Howell    98002  42   Cottonwood  Arizona
Sebastian  Smith     92001  23   Gila Bend   Arizona
Gus        Gray      22322  35   Bagdad      Arizona
Mary Ann   May       32326  52   Tucson      Arizona
Erica      Williams  32327  60   Show Low    Arizona
Leroy      Brown     32380  22   Pinetop     Arizona
Elroy      Cleaver   32382  22   Globe       Arizona


Display all columns for people under 25.


SELECT * FROM empinfo
WHERE age < 25;
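
The request "display all columns for people under 25" can be tried out with Python's built-in sqlite3 module. This is a sketch that assumes an in-memory SQLite database loaded with the sample rows above; the column types and the ORDER BY clause are my own additions for a predictable result. Note that "under 25" needs the condition age < 25.

```python
import sqlite3

rows = [
    ("John", "Jones", 99980, 45, "Payson", "Arizona"),
    ("Mary", "Jones", 99982, 25, "Payson", "Arizona"),
    ("Eric", "Edwards", 88232, 32, "San Diego", "California"),
    ("Mary Ann", "Edwards", 88233, 32, "Phoenix", "Arizona"),
    ("Ginger", "Howell", 98002, 42, "Cottonwood", "Arizona"),
    ("Sebastian", "Smith", 92001, 23, "Gila Bend", "Arizona"),
    ("Gus", "Gray", 22322, 35, "Bagdad", "Arizona"),
    ("Mary Ann", "May", 32326, 52, "Tucson", "Arizona"),
    ("Erica", "Williams", 32327, 60, "Show Low", "Arizona"),
    ("Leroy", "Brown", 32380, 22, "Pinetop", "Arizona"),
    ("Elroy", "Cleaver", 32382, 22, "Globe", "Arizona"),
]

conn = sqlite3.connect(":memory:")  # temporary database held in memory
conn.execute(
    "CREATE TABLE empinfo "
    "(first TEXT, last TEXT, id INTEGER, age INTEGER, city TEXT, state TEXT)"
)
conn.executemany("INSERT INTO empinfo VALUES (?, ?, ?, ?, ?, ?)", rows)

# Declarative query: we say what we want, not how to find it.
under_25 = conn.execute(
    "SELECT * FROM empinfo WHERE age < 25 ORDER BY id"
).fetchall()

for row in under_25:
    print(row)
conn.close()
```

Three rows come back (Leroy Brown, Elroy Cleaver and Sebastian Smith). Mary Jones is exactly 25, so she is not included.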

3.6 Document Object Model (DOM) and Cascading Stylesheets (CSS)

In DITA on December 30, 2009 at 6:14 pm

This week in DITA we have been focusing on Cascading Style Sheets (CSS) and the Document Object Model (DOM).

According to the W3C, “The Document Object Model (DOM) is an application programming interface (API) for valid HTML and well-formed XML documents”. The DOM is more of a concept: it stipulates that documents should be structured hierarchically, and it defines the logical structure of documents and the way a document is accessed and manipulated. CSS controls the presentation of a web page. CSS was designed so that the content of a document written in HTML could be separated from style elements such as the layout, fonts and colours.
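
The hierarchical structure and the "accessed and manipulated" part can be illustrated with Python's xml.dom.minidom module, which implements a subset of the DOM. This is a sketch on a made-up page, not the page from our lab; in a browser the same ideas are reached through JavaScript.

```python
from xml.dom.minidom import parseString

# A made-up, well-formed page with no whitespace between the elements.
page = "<html><body><h1>Title</h1><p>First</p><p>Second</p></body></html>"
doc = parseString(page)

root = doc.documentElement   # the <html> element at the top of the tree
body = root.firstChild       # its only child, <body>
print(root.tagName, [node.tagName for node in body.childNodes])

# Access: find every <p> element anywhere in the tree.
paragraphs = doc.getElementsByTagName("p")

# Manipulate: change the text node inside the first paragraph.
paragraphs[0].firstChild.data = "Changed by the DOM"
print(paragraphs[0].toxml())
```

The document is a tree (html, then body, then its children), and the program reaches and changes a node by walking that tree rather than by editing the text of the file.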

During our practical lab session we were asked to create a CSS file and use it to change the style of our web pages. Here you can see mine:

Advantages of CSS

  • Improvement in accessibility – when you separate the content from the style you make it easier for those who only want the content of the site, e.g. visually impaired users.
  • Flexibility – because the CSS is in a separate file, changes to the layout are much easier to make.
  • Consistency – because one style sheet can be applied to many web pages it is easier to make sure that each page of the site looks the same.
  • Speed – pages download more quickly because the browser can cache a style sheet and reuse it on multiple pages.

Disadvantages of CSS

  • Browsers are inconsistent in how well they support CSS.
  • CSS does not work in earlier versions of Internet Explorer, which will only display plain HTML. Unfortunately there is evidence that a few people still use this browser.

3.5 XML – Extensible Markup Language

In DITA on December 29, 2009 at 7:48 pm

This week in DITA we have been focusing on XML and its relationships with HTML and XHTML.

Put simply, the difference between XML and HTML is that XML describes data while HTML displays it. XML, like HTML, is a markup language, but it was created to retain the flexibility of SGML whilst reducing its complexity. XML came about largely because HTML’s vocabulary of tags is fixed, whereas XML lets authors define their own.

In comparison with HTML the rules that govern XML files are strict. XML files should be well formed, meaning that they conform to the XML syntax rules such as:

  • All XML elements must have a closing tag. (In HTML this is not always necessary.)
  • XML tags are case sensitive.
  • All XML elements must be properly nested.
  • All XML documents must have a root element.
  • Attribute values must always be quoted.
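
The rules above can be checked with Python's standard xml.etree.ElementTree parser, which rejects anything that is not well formed. This is a sketch with made-up snippets; the helper name is_well_formed is hypothetical. It tests well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "well formed": '<note lang="en"><to>Reader</to><body>Hello</body></note>',
    "missing closing tag": "<note><to>Reader</note>",
    "tag case mismatch": "<note><To>Reader</to></note>",
    "badly nested": "<note><b><i>Hello</b></i></note>",
    "two root elements": "<note>one</note><note>two</note>",
    "unquoted attribute": "<note lang=en>Hello</note>",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```

Only the first sample is accepted; each of the others breaks exactly one of the rules in the list.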

XML files should also be valid, which means that they conform to the rules of a Document Type Definition (DTD). When the syntax rules are not followed the XML file is unusable, whereas in HTML browsers tend to tolerate the same mistakes. XHTML is HTML reformulated in XML syntax.

One of the benefits of XML is the fact that it allows the content of the document to be separated from its presentation. In the context of electronic publishing, where different platforms for accessing information (PDA, PC) are used, decisions about presentation can be left until the documents are delivered, thereby aiding interoperability.

3.4 Images and Graphics

In DITA on October 25, 2009 at 11:52 pm

This week in DITA we have been looking at Graphical image formats.

Raster images

Raster images work well for complex graphics such as scenery as they record continuous information. Cells are usually arranged in a grid and each cell has a numeric value that represents its content. These cells are often referred to as picture elements, or pixels for short (Butterworth, 2009).


GIFs were developed by CompuServe in 1987. GIF files support 256 colours as they are in an 8-bit format, meaning they record 8 bits of information for each pixel. Because of this limit GIF files are better used for logos or line drawings. GIFs can also be animated, unlike JPEGs, which are static images (Butterworth, 2009).

JPEGs were developed by the Joint Photographic Experts Group. JPEGs are able to support 16 million colours because of their 24-bit format. Compression is more successful in JPEGs than in GIFs as JPEGs employ a technique called lossy compression, where the colours are subtly modified by creating patterns. The loss in colour data is not normally detectable to the human eye, which means that JPEGs are very good at storing photographic information.

PNGs, like JPEGs, are in 24-bit format but they compress images without the lossy technique. However, PNG files tend to be larger than JPEGs.
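
The colour counts follow directly from the number of bits stored per pixel: n bits give 2 to the power n possible values. A short Python sketch of the arithmetic follows. The 800 by 600 image size is just an assumed example, the function names are hypothetical, and the byte counts are for uncompressed pixel data only.

```python
def colours_available(bits_per_pixel):
    """Number of distinct colours that fit in the given number of bits."""
    return 2 ** bits_per_pixel


def uncompressed_bytes(width, height, bits_per_pixel):
    """Size of the raw pixel data before any compression is applied."""
    return width * height * bits_per_pixel // 8


print(colours_available(8))    # 256, the GIF limit
print(colours_available(24))   # 16777216, roughly 16 million

print(uncompressed_bytes(800, 600, 8))    # 480000
print(uncompressed_bytes(800, 600, 24))   # 1440000
```

The 24-bit image starts out three times the size of the 8-bit one, which is why compression matters so much for photographs on the web.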

The choice of image format is important in web design because large files can affect the time a page takes to render. Depending on the topic of your page you have to decide whether to compromise on resolution or on rendering time.

Vector images are different because they are made up of geometric points, lines and curves which are based on mathematical equations.

Here is a link to an embedded formatted image that is blocky because of the pixels.

3.3 The Internet and World Wide Web

In DITA on October 18, 2009 at 10:46 pm

This week in DITA we have been looking at the Internet and World Wide Web. Before this week’s lecture I had always thought that the Internet and the web were the same thing. The confusion seems to come from the fact that we use the terms interchangeably.

The Internet

The Internet essentially is the structure that holds the web. The Internet started life as ARPANET, a network built for the Advanced Research Projects Agency (ARPA), which the US set up in 1958. In 1969 the University of California in Los Angeles (UCLA) became the first site connected to ARPANET, and since then strides in technology have culminated in the Internet we have today. It is a massive network of networks that allows connected computers to send information to each other using protocols. Other services also run on the Internet; for example, email uses SMTP. The diagram below shows others.


The World Wide Web

The World Wide Web was invented in 1989 by physicist Sir Tim Berners-Lee. It runs on top of the Internet. The web is made up of large-capacity computers known as web servers, which are connected to the Internet through telephone and satellite links. Web servers use the HTTP protocol to let any computer connected to a web server access files across the web. Files are accessed via web addresses, otherwise known as Uniform Resource Locators (URLs). A URL is made up of the protocol (HTTP), the domain name (the name of the computer hosting the file) and the path to the file itself.
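
The three parts of a URL can be pulled apart with Python's standard urllib.parse module. This is a sketch on a made-up address (www.example.com is a placeholder, not one of my pages).

```python
from urllib.parse import urlsplit

# Made-up address used only to show the three parts.
parts = urlsplit("http://www.example.com/news/index.html")

print(parts.scheme)   # http, the protocol
print(parts.netloc)   # www.example.com, the domain name of the host
print(parts.path)     # /news/index.html, the path to the file
```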

In my last post I was introduced to the concept of including metadata through markup. HTML, or Hypertext Markup Language, is a markup language designed by Tim Berners-Lee in which tags indicate to the web browser how the web page should be structured.

In this week’s lab we were required to design and link web pages hosted on our City web space.

Here are the web pages I created in this week’s session.

2nd Week in Dita (3.2 text/Html)

In DITA on October 11, 2009 at 8:18 pm

This week in DITA we have been looking at the representation and storage of data. As humans we are used to a counting system based on base 10 (Butterworth, 2009). With computers this is not the case: computers count in base 2, known as binary. Binary code is a series of zeros and ones, which represent the two states a computer can have, ON or OFF. For example the number 157 would be 10011101 in binary. To be honest I struggled with the concept of binary at first but I found a very helpful tutorial here. A single zero or one is known as a bit and a sequence of eight bits makes a byte. Bytes put together form files.
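
The conversion of 157 can be worked through by repeatedly dividing by 2 and keeping the remainders. Here is a minimal Python sketch of that method; the function name to_binary is hypothetical, and Python's built-in int(text, 2) is used to convert back as a check.

```python
def to_binary(number):
    """Convert a non-negative whole number to a string of binary digits."""
    digits = ""
    while number > 0:
        digits = str(number % 2) + digits  # the remainder is the next digit
        number //= 2                       # carry on with the quotient
    return digits or "0"


print(to_binary(157))        # 10011101
print(int("10011101", 2))    # 157

# The same value written as a sum of powers of two.
print(128 + 16 + 8 + 4 + 1)  # 157
```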

So how do all these bits become text? The answer is ASCII. ASCII, or the American Standard Code for Information Interchange, is a system of computer code in which each of 128 characters is encoded as a seven-bit sequence. For example, 100 0010 equals the uppercase letter B in ASCII.
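
The mapping between seven-bit patterns and characters can be checked in Python with the built-in chr, ord and format functions. This is a small sketch; the helper name to_ascii_bits and the sample word are my own choices.

```python
# 100 0010 read as a binary number is 66, the ASCII code for "B".
code = int("1000010", 2)
print(code, chr(code))   # 66 B


def to_ascii_bits(text):
    """Return the seven-bit pattern for each character in the text."""
    return [format(ord(character), "07b") for character in text]


print(to_ascii_bits("B"))
print(to_ascii_bits("DITA"))
```

Seven bits give 2 to the power 7, or 128, different patterns, which is where the 128 characters come from.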

As the ASCII character set is so basic, it is most commonly used for writing source code and Windows control files such as Config.sys and Win.ini. It is also used to transfer data between applications that do not share a common file format. For example, files that are edited in Notepad and saved with the .TXT extension can later be opened by word processors such as MS Word.

During our lab session we were asked to open a Microsoft Word document with a .DOC extension in Notepad. When we did this we were able to see that Microsoft documents contain metadata. Metadata is essentially data about data; it is used to describe, locate and retrieve information more easily. In a typical Microsoft Word document, metadata can include the name of your computer, the names of previous document authors and template information. One way of including metadata is through markup. I will go into more detail about markup in my next blog when talking about HTML.
