Data Migration

Data Migration is required when organizations change from legacy systems to ]project-open[. Data such as general user & contact information needs to be made available within ]project-open[.

Since data sources have their very own data structure there's no silver bullet solution for the import. In general it can be said that for smaller data sets CVS import offered for users and companies is still considered the less troublesome approach. All you need to take care of is, that certain key values are unique in your CSV files. If you know your EXCEL well that should be quite easy to evaluate.

CSV Import can only be topped by creating individual SQL scripts. This approach should be chosen if you plan to import a large amount of data into ]po[ and requires some knowledge about PL/pgSQL scripting. "SQL for Web Nerds"  (philip.greenspun.com/sql/) is a great starting point and ]po[ provides a couple of scripts that you would need to adjusted

The general process consists usually of the following three steps: 

  • Data extraction
  • Data verification
  • Data loading

Data extraction

Data extraction describes the tasks that are necessary to make data available in a format so it can be imported in ]project-open[ using one of the existing data migration scripts.

Data verification

Data verification ensures that no orphan data or outdated data is copied into the new system. Data migration tasks are a great occasion to clean up your data.
This step is highly recommended and can either be performed in the legacy system or after data extraction phase based on the data exported to the interim format.

Data loading

 Data loading describes the tasks necessary to import data from an interim format to the ]project-open[ database.  

General rules & Comments 

  • Before you starting the migration process, have a look at the ]po[ DB model see what object attributes are part of the ]po[ Data Model and evaluate the need for DynFields to provide additional containers.
  • Start with Contacts and Employees, Persons are needed in all other imports

Transferring Data Objects  

Transferring users

Importing users into ]po[ sometimes requires cleaning up user source data in LDAP, NAV, [SAP PS] or other resources, because users in ]po[ represent natural persons with a number of constraints:

  • Every user needs to have a UNIQUE email address. ]po[ will not allow to enter duplicate email addresses or users without email.
    In practice, it is advisable to use the user email as the "primary key" field to identify a user, in orderto avoid duplicates.
    In case of a missing email address (for example pre-sales leads of which you only know thename) you can construct a fake email as first.last@nowhere.com or similar.

  • First name + Last name are UNIQUE:
    In case of ambiguities it is advisable to "massage" the source information to remove duplicates, for example by adding a "(user_id)" appendix in the last name.

  • The "username" usually contains the Windows domain\login of the user. Again, this field is UNIQUE.

Option 1: Using ]po[ CSV Import

For additional information please see package intranet-csv-import

Option 2: Using ]po[ import scripts - Flat Table Import

Scripts are available that have been used to import data from a customer organization into ]project-open[ during several project roll-outs.
Find them in the packages/intranet-core/import_sql/ folder of your local installation.

Procedure as follows: 

  • In a first step, we have exported all customer data into CSV format (MS-Excel).
  • In a second step, we paste the contents of the CSV files right into the "COPY" sections of the import.sql file. This way, the script automatically populates the import_*  tables with this data. 
  • In a third step the SQL script converts the data from the import_* tables into ]po[ objects.

Please do NOT expect that these scripts will work for you out of the box. Instead, the scripts are provided here as a code sample. You will have to adapt these structure of the import_* tables to your data structures and fix any errors ocurring during the conversion. Still, this is currently the most efficient method to import legacy data into ]project-open[, apart from the CSV import provided under "Companies" and "Users" tabs.

Option 3: LDAP User Integration

The easiest way to import users into ]po[ is to use the ]po[ LDAP interface. By default, the interface can batch- import  users and authentication users agains a single domain.
More complex schemes (multiple domains, multiple accounts per user etc. require customization of this interface.  Please note that ]po[ includes several "authentication authorities".
A user imported manually will authenticate against the local default authority (the ]po[ password database) (and not the LDAP), unless you change the user's authority (column "authority_id"
in table "users") to the LDAP authority.

Option 4: REST Interface

You can create new users and update existing users via the ]po[ REST interface. You can also change the authentication authority and the user's salt/password via the interface. Please see the OpenACS/]po[ system documentation on how to ]po[ stores hashed  passwords.

Option 5: PHP script

We once used PHP to create new users in the ]po[ system. See script below.

Transferring companies 

Option 1: Using ]po[ CSV Import

For additional information please see package intranet-csv-import

Option 2: Using ]po[ import scripts - Flat Table Import

Same procedure as described in "Transferring Users" 

Option 3:REST Interface

You can create new companies and update existing users via the ]po[ REST interface.

Transferring Financial Documents

Most of our clients abstain from importing Financial Document. They rather consult their legacy systems or refer to their accounting system that usually contains the relevant  information.

Option 1: REST Interface Import

The ]po[ REST interface allows for the creation of financial documents via REST style Web-service interface. So you could write an external application in Java, Perl, or any .Net
language to access this interface.

Option 2: PL/SQL (SQL) Import

]po[ provides a PL/SQL (procedural SQL) interface via ODBC or a direct databases connection for creating financial documents. So you could write an external application in Java, Perl or any other language that is capable to connect to a PostgreSQL
database.

Tools & Scripts

In various customer projects we have helped companies migrating their date to ]po[. The following resources have been made available to the community:

PHP Import

Stored Procedure for User Creation:

function acs__add_user (
integer,      -- user_id
varchar,      -- object_type
timestamptz,  -- creation_date
integer,      -- creation_user
varchar,      -- creation_ip
integer,      -- authority_id; default 'local'
varchar,      -- username
varchar,      -- email
varchar,      -- url
varchar,      -- first_names
varchar,      -- last_name
char,         -- password
char,         -- salt
varchar,      -- screen_name
boolean,      -- email_verified_p
varchar       -- member_state
)
returns integer

This is taken from acs-kernel/sql/postgresql/acs-create.sql . This function also creates entries in the persons, user_preferences and parties tables. Example User Creation (SQL):

select acs__add_user(
null, 'user', now(), null, '0.0.0.0', null,
'username',
'test@email.com',
null,
'Firstname', 'Lastname',
'password', 'salt',
'Screenname',
'f', 'approved'
) into v_user_id;

insert into users_contact (user_id) values (v_user_id);

For detailed examples please have a look at intranet-core/import_sql/import.lexcelera.sql Example User Creation (PHP pseudocode):

// Connecting, selecting database
$dbconn = pg_connect("host=localhost dbname=publishing user=www password=foo")
or die('Could not connect: ' . pg_last_error());

$firstname = "Paul";
$lastname = "Smith";
$email = "paul@smith.com";
$home_phone = "+49 228 555123 begin_of_the_skype_highlighting            +49 228 555123      end_of_the_skype_highlighting";

$result = pg_query("
select acs__add_user(
null, 'user', now(), null, '0.0.0.0', null,
'$firstname $lastname',
'$email',
null,
'$firstname', '$lastname',
'$lastname$firstname', 'salt',
'$firstname$lastname',
'f', 'approved'
)
") or die('Query failed: ' . pg_last_error());

$row = pg_fetch_array($result);
$user_id = $row[0];

pg_query("
insert into users_contact
(user_id,home_phone,note)
values
($user_id,'$home_phone','created by php pseudocode example')
") or die('Query failed: ' . pg_last_error());

// Closing connection
pg_close($dbconn);
?>;

Please also have a look at this TCL example from the project-open codebase: .../packages/intranet-core/www/users/upload-contacts-2.tcl

  Contact Us
  Project Open Business Solutions S.L.

Calle Aprestadora 19, 12o-2a

08902 Hospitalet de Llobregat (Barcelona)

Spain

 Tel Europe: +34 609 953 751
 Tel US: +1 415 200 2465
 Mail: info@project-open.com