CV 3.0 Content Access Service —Architecture and Proof of Concept Implementation Specification
CV3.0 is a research project of the Bern University of Applied Sciences (BUAS) to explore a new way of presenting educational achievements and scientific activities in electronical form and online. It enhances the data of a traditional curriculum vitae with additional information and introduces a platform for accessing such a CV3.0 curriculum vitae. This document gives an overview of the platform architecture and specifies the proof of concept (PoC) implementation created at BUAS.
Part I. - Architecture overview
In a CV3.0-environment, multiple systems and components interact to build and provide an online curriculum vitae. This first part describes these systems and their interaction in an abstract manner; a specific implementation is given in the second part which describes the proof of concept implementation at BUAS .
Figure 1 depicts the CV3.0-environment and the involved systems and actors.
In the following sections, all parts will be described.
The inquiring party defines the actor who requests CV-data from the CV3.0-environment, it can be either human or an application/service. Different classes of inquiring parties stand for different levels of trust and differ in restrictions while accessing a CV3.0. The person which is associated with the CV itself by definition has more rights than an employment agency or even an anonymous robot crawling the web. The proposed system must consider these different kinds of inquiring parties and act accordingly.
The full content of a CV3.0, referred to as CV3.0 or CV3.0-content throughout this document, contains the additional CV3.0-data as well as all the data coming from the organisational facilities depending on the access level of the specific inquiring party requesting it.
CV3.0 Content Access Service (cv3cas)
At the front, the CV3.0 Content Access Service (cv3cas) provides access to CV3.0-content for inquiring parties. It serves as a proxy for accessing data from the attribute- and document-facilities of an organisation (see section Facilities), as well as for accessing additional, CV3.0-specific content.
This functionality is implemented in 4 core components which are described in the following.
The front-end service provides access for inquiring parties to CV3.0-content. Content is accessed on a per-CV-basis: The common denominator is a CV which is related to a single person. The CV bundles all the data coming from the organisational facilities as well as the additional CV3.0-content. Depending on the trust-level of the inquiring party, more or less content may be retrieved, see section Trust.
Depending on the inquiring party (human, owner, agent, ...), specific representations of the content are returned.
The core application logic represents the glue between the different components of cv3cas and interacts with them.
CV3.0 introduces additional attributes and metadata on top of the data received from the organisational facilities. The whole set of data including everything related to a specific CV retrieved from any facility as well as the CV3.0-content itself is stored internally in the CV3.0-storage in RDF-format .
The facility connector is responsible for the connection to the organisational facilities for documents and attributes. It provides a plugable interface to access various kinds of backend-systems, like for instance relational database systems or fileshares.
Besides the technical implementation details of cv3cas, some other important aspects surrounding security and data protection must be carefully considered in the design and the implementation.
Security, Trust and Privacy
Dealing with potentially sensitive information, special attention must be given to issues of security, trust and privacy.
On a first level, communication security and confidentiality must be ensured. This is achieved by using well-defined and widely accepted standards such as TLS . All communication with inquiring parties on the front side of cv3cas must be secured according to best current practises. In communication with organisational facilities, the securest method offered by those is used where applicable.
For privacy in terms of personal data protection, different access levels depending on the inquiring party must be implemented. Consider the following examples for clarification on this:
- An anonymous inquiring party can only enumerate the available CVs, however, not request their content
- Retrieval of basic CV-data like academic degrees for inquiring parties with basic trust (e.g. employment agencies)
- Retrieval of full CV-data for inquiring parties specifically allowed by the owner of the CV and by the owner itself
The content of a CV is used by the inquiring party in comparing persons, for instance when applying for an employment. Thus, the contents of a CV must be correct and this correctness has to be proven by the issuing organisation (or all involved organisations in a distributed case).
Digital signatures are now standard in assuring such proof and are also to be used within CV 3.0. Two cases have to be differentiated:
- Signatures issued by cv3cas itself
- Signatures on data received from the organisational facilities described in the next section
In the first case, cv3cas can issue the signature itself and guarantee it's correctness; in the second case, the signature from an organisational facility will be relayed as-is and a statement can be made that the signature has been obtained at a certain point in time.
Not all the data in a CV3.0 is generated by the cv3cas itself. Additional data is coming from surrounding systems within the organisation. Two types of data can be distinguished:
- Data in the form of documents, examples would be a master thesis or other papers written by the person, photos etc.
- Data in the form of attributes, like a person's email-address or birth date.
Systems, which are queried by cv3cas for such data, are called organisational facilities, as they belong to and are run by the organisation which hosts the cv3cas. It is possible for an organisation to have more than one facility of every kind. The two types of organisational facilities will now be described in more depth.
Organisational Document Facility (ODF)
The organisational document facility, in short ODF, is a facility which stores documents (or more exactly: files), which are related to the CV. This could include photos or other material which accompanies the CV of the person, however, the primary use for this facility is the storage and retrieval of certificates and other forms of confirmations for achievements at an organisation.
Organisational Attribute Facility (OAF)
An organisational attribute facility, in short OAF, is a facility which stores and provides attributes about a person within an organisation. Typically, such attributes are used in the domain of identity management  and federated identity management .
Examples for organisational attribute facilities would be an LDAP- or any other directory service , an identity federation, like SWITCHaai  or other internal systems, like for instance systems used by human resources departments.
Part II. - Proof of Concept Implementation
While the first part of this document described the involved systems, components, actors and their relation to each other, the second part now focuses on the proof of concept implementation built during the CV3.0 research project at BUAS. It will describe the technologies used for building and interconnecting the individual parts, as well as the data transmitted between them and included in a final, PoC CV3.0.
Note: the technical overview given in this part is meant explicitly as an overview of the technologies and methods used for the PoC-implementation, as well as - for certain specific areas - as a research summary. It is not an in-depth technical specification at the implementation level. Further documentation will be included with the built PoC.
Organisational Facilities involved in PoC
In the proof of concept, a first (as the name implies) PoC-grade implementation of a cv3cas-service will be implemented and connected to organisational facilities at the BUAS.
SQL-Database as OAF
At BUAS, currently two possibly interesting attribute facilities have been identified:
- IS-Academia (IS-A), a student directory containing all administrative data related to a student
- myIDP, a claim-assertion-service for the SuisseID , developed at BUAS
As both attribute facilities rely on SQL databases as backend for storing the attributes, a facility connector for SQL-databases will be built during the PoC.
Simple ODF in a git-RCS
In a preliminary analysis, the need for a document management system capable of issuing digitally signed documents has been identified. Further research amongst existing DMS-solutions showed that these either don't offer the required signature feature in a form usable by the project or are too complex to install, configure and maintain for the PoC.
It has thus been decided to implement a simplified DMS-system using technology well known to the research group and use that as an exemplary DMS.
For versioning and integrity control in the simulated DMS, the widely deployed revision control system git  will be used. git works as key-value-store where any kind of data which is added to it can be referenced by its SHA-1-hash . As example, figure 2 depicts a revision of the PDF-version of this paper in its git-repository .
commit 03f340691bb90870f19fda45678fd824c93ed53f Author: Pascal Mainini << email@example.com >> Date: Fri Apr 12 16:45:21 2013 +0200 initially added
An additional layer of trust could be added by using an X.509 trusted timestamping-service for each commit to the ODF, similar to this approach . However, it has been decided that digital signature of the documents in the facility is out-of-scope and should be done by the issuer of a document and within the document itself.
cv3cas PoC Implementation
As part of the PoC, a plugin-architecture for facility connectors will be implemented. This architecture permits the use of interchangeable plugins to connect to various types of attribute/document facilities. Plugins have to provide an abstract interface to the respective type of facility and are always implemented in a read-only manner. Two types of operations on facilities have to be defined:
Enumerate Retrieve a list of attributes/documents belonging to a CV
Retrieve Obtain a specific attribute/document belonging to a CV
In both cases, identification of the CV is done using the unique identifier defined in section Unique Identifier.
As back-end platform for the CV3.0-storage-component, Fuseki  of Apache's Jena-project has been choosen. Fuseki is an RDF-triplestore  which has already been deployed in other projects of the research group at BUAS and which is by far sufficient for the needs of the PoC.
RESTful Front-end Service
The front-end service will be implemented as a REST-based  software architecture. REST is built around the HTTP-Protocol  and is a client-/server-based architecture using the concept of resources and representation of these in "documents". State and modification of the resources is achieved using HTTP-methods. The same endpoint can be used by machines as well as humans.
A CV can be accessed over the REST-endpoint using the relative URI
/cv/<identifier> where identifier is the unique identifier of a CV. See section Unique Identifier for
details of the identifier.
The REST-endpoint of the cv3cas supports content negotiation , thus returning a suitable representation depending on what has been requested by the HTTP-agent. Table 1 lists the supported content-types.
||Human- and machine-readable RDF representation of CV in turtle |
||Machine-readable RDF-representation of CV in RDF+XML |
||Either JSON  or JSON-LD  representation of CV|
It is planned to build the HTML-interface for human interaction using uduvudu , a framework for building user interfaces for RDF-data in RDF, currently being developed at BUAS. If the implementation of uduvudu is not mature enough by the time it is needed within the CV3.0-project, a simplistic and hardcoded view on the data will be implemented.
The JSON-content-type is optional and will only be implemented if the other parts could have been completed in time. JSON can nowadays be considered as industry-standard for interaction between applications on the web and would thus facilitate the exchange with other applications, however, for demonstration of the completeness of the PoC it is not necessarily required.
As unique identifier for CVs, the SwissEduPersonUniqueID  has been chosen as it is guaranteed to be unique within the whole educational landscape of Switzerland. It also should be, according to its specification, non-transparent, meaning that it should not be possible to identify a person only through their SwissEduPersonUniqueID.
Another possibility for the identifier would have been the BUAS-specific shortcut of a person which would also be unique BUAS-wide. Eventually, the PoC will offer support for accessing a CV in addition using the BUAS-shortcut.
As described in section 5.2.1, access to a CV3.0 has to be limited depending on the inquiring party. The inquiring party thus has to be authenticated against the front-end service. In the PoC, this authentication will be performed using WebID  as well as a BUAS-specific WebID-"IDP" . Table 2 lists the different roles, how they are identified and a short description.
|Role||Identified by||Access to|
|CV3.0 Owner||BUAS WebID, matching unique ID ||Full CV3.0 content|
|BUAS User||BUAS WebID, different unique ID||Information commonly visible within BUAS|
|Foreign Party||Any WebID, possibly no unique ID||Depending on user consent |
|Anonymous||Unauthenticated||Only minimal content|
A BUAS-specific WebID-"IDP" will also be implemented separately. It provides a BUAS-specific WebID-profile containing the unique ID of the user.
As multiple and different kinds of inquiring parties access a CV, different amount and detail of CV-content is delivered. The owner of a CV must be able to decide to which inquiring party which level of access is given. Multiple possibilities of implementing the user-consent have been identified:
- Automatic consent: a set of inquiring parties is defined and the user can choose the level of access for each one independently
- Per-request consent: upon each request of an inquiring party, the user can decide to grant the request or deny it
- Fixed definitions of inquiring parties and user consent (PoC only)
In the first case, the inquiring parties are known to the CV3.0-environment and the user can decide to allow one or multiple of them individually. This approach can be compared to the "app-model" of Facebook, Twitter and the like.
In the second case, every request for content which is not available by default to an inquiring party requires user interaction. This is a semi-automatic, asynchronous approach.
For the PoC-implementation, in order to keep the development effort reasonable, a third option has been chosen: A predefined set of inquiring parties is defined within the PoC and user-consent is matched statically against this set using a lookup-table.
CV RDF Data Format
As part of his bachelor thesis, Joel Cabral develops the RDF-ontology to represent the contents of a CV3.0 in RDF. This ontology will be used as the base model for the PoC-implementation of CV3.0, the REST-endpoint will return RDF-data according to the specification given by it.
In a CV3.0, digital signatures take place at two locations:
- In the Organisational Document Facility: Storage of digitally signed documents
- In the CV3.0-Content itself: Digitally signed RDF data
While the first case is out of scope of the PoC, the implementation has to handle the second case.
Digitally signing RDF brings up various problems (like for instance normalisation and serialisation), which have already been addressed theoretically as well as with actual implementations. For the theoretical background, refer to Carroll .
Payswarm  has implemented signing, verification as well as encryption/decryption for JSON-LD as part of their client-API . The PoC will base the digital signature of it's CV3.0-content on this work.
Additionally, a small client for the PoC will demonstrate the validation of the digital signature on the side of the inquiring party. Implementation will be provided
as standalone application or integrated as web-application in the content delivered by the
At a given point in time, a new CV has to be instantiated in the
cv3cas, two different approaches can be taken:
- Create a new CV3.0 as soon as the owner of the CV3.0 logs in for the first time
- Periodically create CV3.0s as a batch process
In the first case, upon initial login of the user, documents and attributes from all organisational facilities are retrieved and the CV3.0 is created. In the second case, a periodical batch job (for instance running at the end of each semester or year) gathers all data from the organisational facilities for a specified subset of CV3.0-owners.
It currently has to be determined, which possibilities exist in accessing the organisational facilities at BUAS for the PoC. The possibilities given determine the implementation of the CV3.0-instantiation.
- BFH Homepage, http://www.bfh.ch
- W3C - RDF Primer, http://www.w3.org/TR/rdf-primer
- Wikipedia - Transport Layer Security, https://en.wikipedia.org/wiki/Transport_Layer_Security"
- Wikipedia - Document Management System, https://en.wikipedia.org/wiki/Document_management_system
- Wikipedia - Identity Management, https://en.wikipedia.org/wiki/Identity_management
- Wikipedia - Federated Identity Management, https://en.wikipedia.org/wiki/Federated_identity_management
- Wikipedia - Directory Service, https://en.wikipedia.org/wiki/Directory_service
- SWITCH Authentication, https://www.switch.ch/aai
- SuisseID Homepage, http://www.suisseid.ch
- git Homepage, http://www.git-scm.com
- git - Git Objects, https://www.git-scm.com/book/en/Git-Internals-Git-Objects
- pm - CV3.0, https://denethor.bfh.ch/projects/iam-group-cv3-0
- Stackoverflow - RFC3161 Timestamps, http://stackoverflow.com/questions/11913228/how-can-i-use-rfc3161-trusted-timestamps-to-prove-the-age-of-commits-in-my-git
- Node.js Homepage, http://nodejs.org
- Apache Jena - Fuseki, https://jena.apache.org/documentation/serving_data/
- W3C - Triple Store, http://www.w3.org/2001/sw/Europe/events/20031113-storage/positions/rusher.html
- University of California - REST, https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
- W3C - HTTP, http://www.w3.org/Protocols/rfc2616/rfc2616.html
- W3C - Content Negotiation, http://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html
- W3C - Turtle, http://www.w3.org/TR/turtle/
- RDF+XML, https://tools.ietf.org/html/rfc3870
- JSON Homepage, http://json.org
- JSON-LD, http://json-ld.org/spec/ED/json-ld-syntax/20120522
- Uduvudu, http://uduvudu.me
- SWITCH - swissEduPersonUniqueID, https://www.switch.ch/aai/support/documents/attributes
- W3C WebID Community Group, http://www.w3.org/2005/Incubator/webid/wiki/Main_Page
- BFH WebID Identity Provider, http://webidp.bfh.ch
- See section 8.3.2, Unique Identifier
- See section 8.3.5, User Consent
- W3C - Web Access Control, http://www.w3.org/wiki/WebAccessControl
- Wimmics - Shi3ld, http://wimmics.inria.fr/projects/shi3ld/
- Signing RDF Graphs, http://www.hpl.hp.com/techreports/2003/HPL-2003-142.pdf
- Web Payments, http://payswarm.com
- GitHub - Payswarm Client, https://github.com/digitalbazaar/payswarm.js/blob/master/lib/payswarm-client.js
- W3C - Cert Ontology, http://www.w3.org/ns/auth/cert