Lightweight High Performance Reference Database

The technique described in this lesson is useful for reference databases with a low change frequency and a large number of documents.

I've created a Lightweight Directory on Notes. The technique used to build this database is useful for other reference databases with a low change frequency and a large number of records. Example: in the Netherlands a list of valid ZIP codes could be stored this way. I've chosen the Directory on Notes because it best proves that the concept works.

The IBM Directory on Notes (DoN) is not a successful application. It has two major problems.

The first problem for the user is that the DoN opens very slowly, especially when the user opens the database for the first time. The administration of the unread marks causes this problem: for each database Notes keeps track of the read and unread documents, even if none of the views show unread marks. IBM, Lotus and Iris have been working on this problem. One result of this effort is visible to the user in the d_dir, where the 'DoN Unread lists' database is listed next to the DoN.

The second problem is the replication of the database. Replication from an enterprise server to a workstation takes many (really many) hours. The initial replication setup takes 6 to 10 hours, and the replication itself takes a good number of additional hours. During the Notes roll out the database was loaded using FTP and not by Notes replication. As a result, OS/2 staging servers with the DoN were not possible/allowed.

From 23 to 25 February 1998 I was at the Lotus Technical Forum in Boston. One of the last presentations was about the new directory (formerly the name and address book). A little demo was given. The directory contained about 100K persons, but the database property "Documents" showed that the database contained 600 documents. At the end of the presentation the question was asked whether we, as developers, could use this feature in our own applications. The speaker couldn't give or didn't know the answer.


Triggered by the directory of Notes release 5, I wondered if I could create a lightweight version of the Directory on Notes for Notes release 4. The answer is yes. Here are the characteristics of the IBM DoN and my lightweight version of it.


    Characteristic        IBM DoN             Lightweight DoN
    Database size         > 1 GB              200 MB
    Number of documents   400K                3K
    Number of fields      20 million          400K
    Number of views       many                one, lookup on name only
    First open            times out           within 30 seconds (including loading the graphical navigator)
    Full replication      > 10 hours          < 1 hour
    Data                  All employee data   Application field data only

Given the number of employees and the number of documents in the database, it is clear that the information of multiple employees is stored in one document. This is done as follows:

° a multi value field containing the employee names concatenated with the employee number to make each field value unique
° a field containing the links between the employee number and the field name in which the employee information is stored (the index)
° a field for each employee containing the employee information; the "fields" (name, phone, ...) within the field are separated by one separator character
° a field for the country code (only employees from the same country are stored in a document, this makes selective replication to workstation possible)

The following table gives an example of the contents of a document:



Field        Value
key          Schroder, J (Jasper)(788076109
             Schroder, T (Taro)(788076112
             Schroer, M (Marcel)(788076108
empindex     788076109%I0#788076112%I1#788076108%I2
I0           Schroder, J (Jasper)#CN=Jasper Schroder/OU=Netherlands/O=IBM@IBMNL#=#788076109#N#067900####IAC3DU#1F1.20#UIT IO#Netherlands#####31-079-322-8069#-8069#788024222#P1N
I1           Schroder, T (Taro)##AMPVM1.SCHRODER@VM#788076112#N#012500####IAC-1#IAC-1#IAC IO#Netherlands#####31-020-513-6795#-6795#788093280#P1N
I2           Schroer, M (Marcel)##EAMSVM1.SFS1234@VM#788076108#N#100900####ZTM112#ZTM#NETH SFS#Netherlands#####31-079-3223037#322-3037#788061861#P1N
Country      "788"
$anonymous
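
The layout in the table above can be sketched in a few lines of Python. This is only an illustration of the packing idea, not the code of the Lightweight DoN: a plain dict stands in for a Notes document, the function name build_document is made up for this sketch, and the records are abbreviated to name, number and phone. It also assumes that no value contains the separator character itself.

```python
# Sketch only: a plain dict stands in for one Notes document.
RECORD_SEP = "#"  # separates values inside a record and entries inside the index
INDEX_SEP = "%"   # separates employee number and field name in an index entry


def build_document(country, employees):
    """Pack several employees into one document.

    employees is a list of (name, number, values) tuples, where values is
    the list of strings that make up the employee record.
    """
    doc = {"Country": country, "key": [], "empindex": ""}
    index_entries = []
    for position, (name, number, values) in enumerate(employees):
        field_name = f"I{position}"
        # Name plus number keeps every value of the multi-value field unique.
        doc["key"].append(f"{name}({number}")
        index_entries.append(f"{number}{INDEX_SEP}{field_name}")
        # One field per employee, values joined by the separator character.
        doc[field_name] = RECORD_SEP.join(values)
    doc["empindex"] = RECORD_SEP.join(index_entries)
    return doc


# Records abbreviated to name, number and phone for readability.
example = build_document("788", [
    ("Schroder, J (Jasper)", "788076109",
     ["Schroder, J (Jasper)", "788076109", "31-079-322-8069"]),
    ("Schroder, T (Taro)", "788076112",
     ["Schroder, T (Taro)", "788076112", "31-020-513-6795"]),
])
print(example["empindex"])
```

Because all employees in one document share the country code, a selective replication formula only has to test the Country field.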

With the view setting 'Show multiple values as separate entries' turned on for the column showing the "key" field (name and number), you get a view with all the employees in it. As Notes doesn't tell you which entry of the multi-value field is shown on a given line, you are restricted to a one-column view.

The user cannot open a document from this view as in other Notes applications, because the opened document contains the data of multiple employees and the application cannot determine which employee was selected in the view. So I created two forms: one to look up employee information and one to select names to be inserted in the addressee fields of a memo. See the Lightweight Directory on Notes for the code to look up information.
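
The lookup itself is in the Lightweight Directory on Notes database; the Python below is only my sketch of the idea, under the assumption that the selected view entry (name and number) is known to the lookup form. The names view_entries, lookup and SAMPLE_DOCS are made up for this sketch, a dict again stands in for a Notes document, and the records are abbreviated to name, number and phone.

```python
def view_entries(documents):
    """Mimic a one-column view that shows each "key" value as its own entry."""
    return sorted(key for doc in documents for key in doc["key"])


def lookup(documents, entry):
    """Return the record values for the selected view entry, or None."""
    number = entry.rsplit("(", 1)[1]  # the number follows the last "("
    for doc in documents:
        if entry not in doc["key"]:
            continue
        # empindex maps employee number to the field that holds the record.
        index = dict(item.split("%", 1) for item in doc["empindex"].split("#"))
        return doc[index[number]].split("#")
    return None


# One sample document; records abbreviated to name, number and phone.
SAMPLE_DOCS = [{
    "Country": "788",
    "key": [
        "Schroder, J (Jasper)(788076109",
        "Schroder, T (Taro)(788076112",
        "Schroer, M (Marcel)(788076108",
    ],
    "empindex": "788076109%I0#788076112%I1#788076108%I2",
    "I0": "Schroder, J (Jasper)#788076109#31-079-322-8069",
    "I1": "Schroder, T (Taro)#788076112#31-020-513-6795",
    "I2": "Schroer, M (Marcel)#788076108#31-079-3223037",
}]

for entry in view_entries(SAMPLE_DOCS):
    print(entry, "->", lookup(SAMPLE_DOCS, entry))
```

The number in the key value is what makes the lookup possible: the name finds the document, the number finds the field within it.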


The lesson learned here is that IBM can create a Directory on Notes which:

° can be accessed reasonably fast (works over a phone line for me),
° replicates easily over the enterprise servers,
° can be replicated to a local workstation (the Netherlands directory is 3.5 MB and replicates within 5 minutes)

Notes:

° The construction used is useful for other large reference databases as well
° The maintenance (insertions/deletions) of the data is more complicated than in other applications. You can no longer just delete a record by deleting a document in the database. Instead you have to process the index or rebuild the document with all the records within.
° IBM's strategic direction for the directory is DB2, and there is no intention to upgrade the DoN according to this lesson learned
° Notes release 5 will deliver a new directory which can hold up to a million entries. At this moment it is not clear if this new directory can and will replace the DoN.
° For more information about lightweight reference databases see my entry in the Best Practices Conference database (the document in this database describes the solution in more detail and contains the code to build a simplified NAB this way). The Best Practices database can be found on server D01DBR01 in the n-dir.
° For the Freelance presentation about the lightweight DoN see the Team.Connect website (link to Team.Connect from w3.coc.ibm.com, go to pizza sessies, select presentations; the DoN presentation is "Notes Best Practices/Performance" of 16/04/1998)
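
To illustrate the maintenance note above, deleting one record by processing the index could look roughly like this. It is a sketch under the same assumptions as the document layout shown earlier (a dict stands in for a Notes document, records abbreviated to name, number and phone); delete_employee is a name made up for this sketch and is not taken from the actual application.

```python
def delete_employee(doc, number):
    """Remove one employee from a packed document; return True if found."""
    pairs = [item.split("%", 1) for item in doc["empindex"].split("#") if item]
    index = dict(pairs)  # employee number -> field name, in original order
    field_name = index.pop(number, None)
    if field_name is None:
        return False
    # Drop the record field, the key value and the index entry together.
    doc.pop(field_name, None)
    doc["key"] = [key for key in doc["key"] if key.rsplit("(", 1)[1] != number]
    doc["empindex"] = "#".join(f"{num}%{name}" for num, name in index.items())
    return True


# Sample document; records abbreviated to name, number and phone.
packed = {
    "Country": "788",
    "key": [
        "Schroder, J (Jasper)(788076109",
        "Schroder, T (Taro)(788076112",
        "Schroer, M (Marcel)(788076108",
    ],
    "empindex": "788076109%I0#788076112%I1#788076108%I2",
    "I0": "Schroder, J (Jasper)#788076109#31-079-322-8069",
    "I1": "Schroder, T (Taro)#788076112#31-020-513-6795",
    "I2": "Schroer, M (Marcel)#788076108#31-079-3223037",
}

removed = delete_employee(packed, "788076112")
print(removed, packed["empindex"])
```

The remaining records keep their field names (I0, I2), so no other record has to be rewritten; rebuilding the whole document is the alternative when many records change at once.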

The April version of the Lightweight DoN is still available on IBMNLP1 (9.132.37.141) in the 'test' directory.