Centralized Ledger System to Manage Employee Personal Information

By Prasad M.

Wipro Tech Blogs
Oct 28, 2021

Employees receive many documents during their time with an organization. These documents reside in multiple systems owned by different teams, so anyone who needs a consolidated view of an employee's data and documents must access several applications. In addition, enterprise applications often must track usage data and maintain a complete audit trail over time. With the current process, data traceability becomes a pain point because employees access multiple systems to view critical data and download documents. A consolidated view of all documents in one system, with traceability, lets the organization see and verify who has accessed critical data.

Let's consider a large enterprise with over 100K employees whose primary business is service delivery. The organization contains various departments that manage employee data and the relevant documents. We will consider two departments of interest, Hiring and Travel, which handle visa applications, profiles, and letters. Assume that both application systems together hold around 500 GB of data, with expected growth of 1 TB over the next 10 years. Managers and supervisors regularly access employee data and documents. On average, the audit trail will record the document activity of about two hundred employees per day.

The stored audit data must be secure, immutable, auditable, and tamper-proof.

Let us identify the available options to address this problem statement.

Solution 1: Traditional Database

To protect data in databases from modification, organizations employ various mechanisms, including access controls, restricted access to data, reconciliation, audit trails, and firewalls. Despite these measures, data stored in a database remains mutable and can be altered once someone gains access to it.

As a solution, a traditional database has the following shortcomings:

  • No built-in metadata and data history
  • No built-in immutability and cryptography

This is where blockchain technology makes the biggest impact.

Solution 2: Decentralized Blockchain Network

Blockchain, as we understand it, is a decentralized and distributed database containing a registry of transactions that is shared among the peer nodes, or participants, in the network. The registry holds an extensive list of transactions and is constantly updated as new transactions happen. Starting from the initial transaction, a set of transactions is grouped into a block according to a predefined block size. Once that block is created, the next set of transactions forms another block, which is then linked to the previous block. Over time, a series of blocks is formed in which each block is connected to the one before it, like a chain.
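
To make the block linking concrete, here is a minimal Java sketch that is not taken from any blockchain framework. It groups transactions into fixed-size blocks, stores the previous block's SHA-256 hash in each new block, and re-validates the links. The names HashChainSketch, Block, buildChain, and isValid, the "GENESIS" marker, and the payload layout are assumptions made purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative hash chain; every name here is hypothetical.
final class HashChainSketch {

    static final class Block {
        final int index;
        final List<String> transactions;
        final String previousHash;
        final String hash;

        Block(int index, List<String> transactions, String previousHash) {
            this.index = index;
            this.transactions = new ArrayList<>(transactions);
            this.previousHash = previousHash;
            this.hash = computeHash(index, this.transactions, previousHash);
        }
    }

    // Hashes the block index, the previous block's hash, and the transactions.
    static String computeHash(int index, List<String> transactions, String previousHash) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            String payload = index + "|" + previousHash + "|" + String.join(";", transactions);
            byte[] digest = sha256.digest(payload.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Groups transactions into blocks of blockSize and links each block to its predecessor.
    static List<Block> buildChain(List<String> transactions, int blockSize) {
        List<Block> chain = new ArrayList<>();
        String previousHash = "GENESIS"; // assumed marker for the first block
        for (int start = 0; start < transactions.size(); start += blockSize) {
            int end = Math.min(start + blockSize, transactions.size());
            Block block = new Block(chain.size(), transactions.subList(start, end), previousHash);
            chain.add(block);
            previousHash = block.hash;
        }
        return chain;
    }

    // Recomputes every hash and checks every link back to the previous block.
    static boolean isValid(List<Block> chain) {
        String expectedPrevious = "GENESIS";
        for (Block block : chain) {
            if (!block.previousHash.equals(expectedPrevious)) {
                return false;
            }
            if (!block.hash.equals(computeHash(block.index, block.transactions, block.previousHash))) {
                return false;
            }
            expectedPrevious = block.hash;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Block> chain = buildChain(Arrays.asList("tx1", "tx2", "tx3", "tx4", "tx5"), 2);
        System.out.println("valid before tampering: " + isValid(chain));
        chain.get(0).transactions.set(0, "tampered"); // alter an old transaction
        System.out.println("valid after tampering: " + isValid(chain));
    }
}
```

Because each block's hash covers the previous block's hash, altering an old transaction no longer matches the stored hash, which is the property the audit trail relies on.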

Both of the above solutions have advantages and disadvantages with respect to the problem statement. When we dig deeper into the blockchain space, the choice again boils down to two options.

1) Centralized Ledger from a trusted provider

2) Decentralized Ledger

Problems around the decentralized network: why are decentralized blockchain frameworks not needed for this problem statement?

Blockchain frameworks such as Hyperledger Fabric and Ethereum can also be used as a ledger for this use case. However, they add complexity: we would need to set up an entire blockchain network with multiple nodes, manage its infrastructure, and have the peer nodes validate each transaction before the network approves it and adds it to the ledger.

A distributed ledger is not recommended where lack of trust is not an issue, as is the case within a single enterprise. A blockchain is a trustless environment: no participant trusts another, and there is no central authority to ensure trust between the transacting parties. For an enterprise dealing with its own data, however, trust is not a problem. For example, employee documents and their associated metadata will be accessed by multiple people within the organization, and we don't need a distributed ledger to keep track of these transactions.

The way forward — Architectural decision

To avoid the burden of creating and managing an entire blockchain network while still using a subset of blockchain features, such as data immutability and verifiability, a centralized ledger database is the way forward. The data volume and expected data growth were also considered in making this decision.

Centralized Ledger Database

Amazon Quantum Ledger Database (QLDB) is a fully managed, cloud-based ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log. The ledger is owned by a central trusted authority.

You can use Amazon QLDB to track all application data changes and maintain a complete and verifiable history of changes over time. The application persists data to QLDB through the QLDB Java driver (our choice for this application), which works with a SQL-like query language called PartiQL. Amazon QLDB uses cryptography to create a concise summary of each data change, and it provides audit trails out of the box without any further implementation.

Proposed Solution and Approach

The proposed solution enables various stakeholders to access employee records and documents through a unified portal. This improves on the current process, in which stakeholders must connect to multiple systems to view employee records. The solution also records an immutable audit trail of all activities performed against an employee's records from the unified portal, such as viewing and downloading employee documents and accessing critical data such as personal information. The audit trail of user activity on the documents is provided with the help of Amazon's QLDB database.

Architecture decision

  • The solution will be built using Amazon QLDB and will be used to record the metadata of employee documents and user activity.
  • A common, lightweight schema (JSON) for data exchange between the systems.
  • Active Directory for user management. Users should be identified only from within the organization and able to sign on using SSO.
  • Application secrets management with cryptography features.
  • RESTful APIs: an interface between the various systems over HTTP, which keeps communication flexible.
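
As an illustration of the lightweight JSON exchange format, here is a minimal Java sketch of an audit event that one system might send to another. The class name AuditEventSketch and the fields actorId, employeeId, documentId, action, and occurredAt are assumptions for this example, not the schema of the actual solution. The JSON is built by hand only to keep the sketch dependency-free; a real service would more likely use a JSON library.

```java
import java.time.Instant;

// Hypothetical audit event exchanged between systems as lightweight JSON.
final class AuditEventSketch {

    enum Action { VIEW, DOWNLOAD }

    private final String actorId;     // who performed the action
    private final String employeeId;  // whose record was touched
    private final String documentId;  // which document was touched
    private final Action action;
    private final Instant occurredAt;

    AuditEventSketch(String actorId, String employeeId, String documentId,
                     Action action, Instant occurredAt) {
        this.actorId = actorId;
        this.employeeId = employeeId;
        this.documentId = documentId;
        this.action = action;
        this.occurredAt = occurredAt;
    }

    // Escapes quotes, backslashes, and control characters for a JSON string value.
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Serializes the event as a single flat JSON object.
    String toJson() {
        return "{\"actorId\":\"" + escape(actorId) + "\""
                + ",\"employeeId\":\"" + escape(employeeId) + "\""
                + ",\"documentId\":\"" + escape(documentId) + "\""
                + ",\"action\":\"" + action.name() + "\""
                + ",\"occurredAt\":\"" + occurredAt.toString() + "\"}";
    }

    public static void main(String[] args) {
        AuditEventSketch event = new AuditEventSketch(
                "mgr-001", "emp-123", "visa-letter-7",
                Action.DOWNLOAD, Instant.parse("2021-10-28T09:30:00Z"));
        System.out.println(event.toJson());
    }
}
```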

Technical Architecture

As part of the application design and implementation, the solution enables stakeholders and employees to view and download visa application data, qualifications, and work experience documents from the unified portal. The QLDB-based solution is integrated with the back-end system.

The proposed solution enables the admin user to see a consolidated view of all historical events for an employee from the UI. Below is a glimpse of what the audit trail screen would look like in the user interface.

QLDB Technical Insights

Centralized ledger solutions are very new to the technology landscape, and beginners need some time and effort to get started with the SDK APIs and the queries. The Amazon QLDB site has good documentation on the concepts and the API.

Below is a quick look at the SQL-like query syntax used in the QLDB query console and a Java code snippet that establishes a connection using the SDK. The same material is available in the Amazon developer guide and the GitHub code repositories.

Ref: https://docs.aws.amazon.com/qldb/latest/developerguide/getting-started.java.html

Ref: https://github.com/aws-samples/amazon-qldb-dmv-sample-java

Amazon QLDB performance information is only minimally documented. More details can be found at https://aws.amazon.com/qldb/faqs/

SQL-compatible query style

The SQL-like operations below help the user load data using the QLDB query console. Because Amazon QLDB is schema-less, modifying your data documents is straightforward.

Amazon QLDB also provides an SDK that we can use to connect to the ledger. We used the Amazon QLDB driver for Java. Below is a quick code snippet that creates and executes a transaction using amazon-qldb-driver-java.

Application Data Overview

In the figure below, every update to the employee record in the User View triggers a corresponding revision, and QLDB itself maintains the history as part of its History View. The version number highlighted in yellow indicates the revised version and the changes made in that revision. Each revision also contains transaction metadata.
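
The relationship between the current view and the history can be pictured with a toy in-memory model. The Java sketch below is not QLDB's implementation or API; it only mimics the behaviour described above, where every write appends a new revision with an incremented version number and some transaction metadata. The names RevisionHistorySketch, Revision, txId, and txTime, and the choice to start version numbers at 0, are assumptions for illustration.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy append-only revision store; every name here is hypothetical.
final class RevisionHistorySketch {

    static final class Revision {
        final int version;              // incremented on every write
        final String txId;              // transaction metadata
        final Instant txTime;           // transaction metadata
        final Map<String, String> data; // document content at this version

        Revision(int version, String txId, Instant txTime, Map<String, String> data) {
            this.version = version;
            this.txId = txId;
            this.txTime = txTime;
            this.data = data;
        }
    }

    private final List<Revision> revisions = new ArrayList<>();

    // Appends a new revision; earlier revisions are never modified or removed.
    Revision write(String txId, Instant txTime, Map<String, String> data) {
        Map<String, String> snapshot = Collections.unmodifiableMap(new LinkedHashMap<>(data));
        Revision revision = new Revision(revisions.size(), txId, txTime, snapshot);
        revisions.add(revision);
        return revision;
    }

    // The "user view": only the latest revision.
    Revision current() {
        if (revisions.isEmpty()) {
            throw new IllegalStateException("no revisions written yet");
        }
        return revisions.get(revisions.size() - 1);
    }

    // The "history view": every revision, oldest first, read-only.
    List<Revision> history() {
        return Collections.unmodifiableList(revisions);
    }

    public static void main(String[] args) {
        RevisionHistorySketch employee = new RevisionHistorySketch();
        Map<String, String> record = new HashMap<>();
        record.put("designation", "Engineer");
        employee.write("tx-1", Instant.parse("2021-01-01T00:00:00Z"), record);
        record.put("designation", "Senior Engineer");
        employee.write("tx-2", Instant.parse("2021-06-01T00:00:00Z"), record);
        for (Revision r : employee.history()) {
            System.out.println(r.version + " " + r.txId + " " + r.data);
        }
    }
}
```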

Technical Challenges

We have come across a few technical challenges during the implementation and below are some of them.

  • Issues with pagination/ordering/limit

Although QLDB is a good fit as an immutable system of record, its query support is limited: it does not support pagination, ordering, or limit clauses. Pagination and sorting therefore cannot be done in the query itself, and a custom solution is needed. We had to develop an in-memory cache in the application layer to store recently accessed data.
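
A minimal sketch of the kind of application-layer workaround described above: sort rows that are already held in memory and slice out one page. This is an assumed illustration, not the code from our implementation; the class name InMemoryPagingSketch and the page method are hypothetical, and page numbers are zero-based here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Application-layer sorting and paging over cached rows; names are hypothetical.
final class InMemoryPagingSketch {

    // Returns one page of rows, sorted by the given order, without mutating the input.
    static <T> List<T> page(List<T> rows, Comparator<? super T> order, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize must be > 0");
        }
        List<T> sorted = new ArrayList<>(rows);
        sorted.sort(order);
        long from = (long) pageNumber * pageSize; // long avoids int overflow
        if (from >= sorted.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList((int) from, to));
    }

    public static void main(String[] args) {
        List<String> cachedActors = Arrays.asList("carol", "alice", "erin", "bob", "dave");
        System.out.println(page(cachedActors, Comparator.<String>naturalOrder(), 0, 2));
        System.out.println(page(cachedActors, Comparator.<String>naturalOrder(), 2, 2));
    }
}
```

The trade-off is that the whole result set has to be loaded before it can be sorted, so this only suits result sets small enough to hold in memory.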

  • Dependencies

According to Amazon, QLDB mainly targets online transaction processing (OLTP) workloads, so we need to evaluate beforehand whether all types of business queries fit. QLDB does provide Streams, through which data can be safely offloaded to a parallel database for other kinds of queries. However, this ties the solution to AWS offerings.

Refer: https://docs.aws.amazon.com/qldb/latest/developerguide/limits.html.

Final Thoughts

Centralized ledger solutions are a great fit as an immutable system of record when lack of trust is not an issue. We cannot draw a clear line between Amazon QLDB and blockchain, as they differ in their fundamental approach and concepts. Most organizations already use cloud providers, and centralized ledger solutions from such providers are generally trusted. QLDB offers most of blockchain's cryptographic features. Using Amazon's QLDB database gives us cryptographically verifiable, tamper-proof data with a built-in audit trail.
