Tuesday, 27 August 2019

Pentaho Architecture

What is Pentaho Client-server Architecture?

       Let's take an example: a full data pipeline contains many jobs, and saving everything on the local system is not a good idea. For that reason Pentaho has a client-server architecture. A client is installed on your machine and connects to a server, for example one running in the cloud. Many clients can connect at the same time and create and run their jobs using the same server.


What do you have to do for that?

    Create a cloud instance, for example an EC2 instance in the Amazon cloud. On that machine, install Pentaho the same way we did on the local machine: extract the package on the machine. After that, install a database (any RDBMS will do, such as MySQL or PostgreSQL). We want to create the repository in this database so that many users can connect to it at the same time.

What do we need to do for the Pentaho repository?

    After installing the database, create a database and a username and password for it. Pentaho then needs some configuration in the Kettle configuration files. These files, including kettle.properties, are in the ~/.kettle directory on your system.
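
As a rough illustration of the database step, the sketch below builds the two PostgreSQL statements that create a login role and a database owned by that role. The names pentaho_repo and pentaho_user and the password are placeholders invented for this example, not values from the setup described here, and the syntax would differ for MySQL.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def repository_setup_sql(database: str, user: str, password: str) -> List[str]:
    """Return PostgreSQL statements that create a login role and its database."""
    return [
        f"CREATE ROLE {quote_ident(user)} LOGIN PASSWORD {quote_literal(password)};",
        f"CREATE DATABASE {quote_ident(database)} OWNER {quote_ident(user)};",
    ]


if __name__ == "__main__":
    # Hypothetical names; run the printed statements as a database superuser.
    for statement in repository_setup_sql("pentaho_repo", "pentaho_user", "change-me"):
        print(statement)
```

The printed statements can be pasted into a client such as psql on the server machine.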

If repositories.xml already exists in that folder, you can edit it. Otherwise, create a new file
with the name repositories.xml


<?xml version="1.0" encoding="UTF-8"?>
<repositories>
  <connection>
    <name>Your Database Repository Name</name>
    <server>Your machine IP address</server>
    <type>Your database type (POSTGRESQL)</type>
    <access>Native</access>
    <database>Database Name</database>
    <port>Database port</port>
    <username>Database user name</username>
    <password>Database User Password</password>
    <servername/>
    <data_tablespace/>
    <index_tablespace/>
    <attributes>
      <attribute><code>FORCE_IDENTIFIERS_TO_LOWERCASE</code><attribute>N</attribute></attribute>
      <attribute><code>FORCE_IDENTIFIERS_TO_UPPERCASE</code><attribute>N</attribute></attribute>
      <attribute><code>IS_CLUSTERED</code><attribute>N</attribute></attribute>
      <attribute><code>PORT_NUMBER</code><attribute>5432</attribute></attribute>
      <attribute><code>PRESERVE_RESERVED_WORD_CASE</code><attribute>Y</attribute></attribute>
      <attribute><code>QUOTE_ALL_FIELDS</code><attribute>N</attribute></attribute>
      <attribute><code>SUPPORTS_BOOLEAN_DATA_TYPE</code><attribute>Y</attribute></attribute>
      <attribute><code>SUPPORTS_TIMESTAMP_DATA_TYPE</code><attribute>Y</attribute></attribute>
      <attribute><code>USE_POOLING</code><attribute>N</attribute></attribute>
    </attributes>
  </connection>
  <repository>
    <id>KettleDatabaseRepository</id>
    <name>Repository Name</name>
    <description>Database repository</description>
    <is_default>true</is_default>
    <connection>Your Database Repository Name</connection>
  </repository>
  <repository>
    <id>KettleDatabaseRepository</id>
    <name>Repository Name</name>
    <description>Database repository</description>
    <is_default>false</is_default>
    <connection>Your Database Repository Name</connection>
  </repository>
</repositories>
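
A quick sanity check on this file can save some debugging. The sketch below assumes that each repository entry refers to its database connection by the value of that connection's name tag, and it reports any repository whose connection is not defined in the file. The function name and the sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET


def unresolved_connections(xml_text: str) -> list:
    """Return (repository name, connection name) pairs with no matching connection."""
    root = ET.fromstring(xml_text)
    # Names of the <connection> blocks defined directly under <repositories>.
    defined = {(c.findtext("name") or "").strip() for c in root.findall("connection")}
    problems = []
    for repo in root.findall("repository"):
        wanted = (repo.findtext("connection") or "").strip()
        if wanted not in defined:
            problems.append(((repo.findtext("name") or "").strip(), wanted))
    return problems


if __name__ == "__main__":
    # Hypothetical, trimmed-down sample; a real file has more tags per entry.
    sample = """<repositories>
  <connection><name>repo_db</name></connection>
  <repository><name>Team Repo</name><connection>repo_db</connection></repository>
  <repository><name>Old Repo</name><connection>missing_db</connection></repository>
</repositories>"""
    print(unresolved_connections(sample))
```

An empty list means every repository points at a connection that exists in the file. To check a real file, read its contents and pass them in as bytes so that the XML declaration on the first line is handled by the parser.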


After that, restart the Carte server on your server machine using the following commands:

#cd /usr/local/data-integration
#nohup sh carte.sh 0.0.0.0 8181 > "/usr/local/data-integration/carte_logs/carte.err.log" &


Now your server is up and running. You can run any job on the server, or you can connect
the client machine to the database repository and run the job on the client as well.
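
To confirm that Carte is reachable from a client machine, a small check like the one below may help. It is a sketch under two assumptions: Carte serves its status page at /kettle/status/ with HTTP basic authentication, and the default cluster/cluster account has not been changed on your server. The host value in the example is a placeholder.

```python
import base64
import urllib.request


def carte_status_url(host: str, port: int) -> str:
    """Build the URL of Carte's status page."""
    return f"http://{host}:{port}/kettle/status/"


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def carte_is_up(host: str, port: int, user: str, password: str,
                timeout: float = 5.0) -> bool:
    """Return True if the status page answers with HTTP 200."""
    request = urllib.request.Request(
        carte_status_url(host, port),
        headers={"Authorization": basic_auth_header(user, password)},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    # Placeholder host; 8181 is the port used when starting Carte above.
    print(carte_is_up("localhost", 8181, "cluster", "cluster"))
```

A result of False can mean that the server is down, that the port is blocked by a firewall or security group, or that the credentials are wrong.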

How to connect from client to server machine?

Click on the Connect button.

Click on Repository Manager.

Click Add to create a new repository.

Click on Get Started.

Create the repository here: give your repository a name, enter your server URL, and save.


After that, the name of the repository you created will appear in the Connect option.

    When you connect to the repository, it will ask for the repository username and password. Fill in the username and password and you will be able to connect to the repository.

Now everything is done and your Pentaho client-server architecture is ready to use. You can create a job or transformation and save it in the root directory, or you can create a new directory if you prefer. It is up to you.
   














