Scriptella ETL 工具操作筆記

2013/03/26 張貼者: Damon.Huang
Open Source 的 ETL(Extract-Transform-Load) 工具還不少,像 Pentaho 的 ETL-Kettle 的功能就已經很接微軟的SSIS。 不過用起來感覺上挺複雜的,測過了Scriptella後覺得,如果應用上不是太複雜的話 ,算是可以做一些不錯的應用, 也由於是用 Java 寫的,所以移植性更佳。

安裝及環境設定

1. 下載程式
scriptella

2. 設定環境參數:
Unix
# vi ~/.bashrc
加入以下內容
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export SCRIPT_HOME=/usr/local/ETL/scriptella-1.0
PATH=$PATH:$JAVA_HOME/bin:$SCRIPT_HOME/bin
Windows
系統內容
set PATH=%PATH%;

基本操作方式

<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <description></description>
    <properties>
        <include href="db.properties"/>
    </properties>
    
    <!-- Oracle EB2 -->
    <connection id="OracleConnOut" 
                driver="oracle.jdbc.driver.OracleDriver" 
                url="jdbc:oracle:thin:@172.18.0.81:1521:DB2" 
                user="scott" password="tiger"/>

    <!-- SQL Server -->
    <connection id="OracleConnIn" 
                driver="oracle.jdbc.driver.OracleDriver" 
                url="jdbc:oracle:thin:@172.18.3.20:1521:DB2" 
                user="CBPM" password="CBPM"/>
    
    <script connection-id="out">
        <include href="dbschema.sql"/>
    </script>
    <query connection-id="in">
        SELECT * from Bug
        <script connection-id="out">
            INSERT INTO Bug VALUES (?ID, ?priority, ?summary, ?status);
        </script>
    </query>
</etl>

Using JavaScript for ETL transformations

<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <connection id="db" driver="auto" url="jdbc:hsqldb:mem:tst" 
                   user="sa" password="" classpath="../lib/hsqldb.jar"/>
    <connection id="js" driver="script"/>

    <!-- For printing debug information on the console -->
    <connection id="log" driver="text"/> 

    <script connection-id="db">
        CREATE TABLE Table_In (
            Error_Code INT
        );
        CREATE TABLE Table_Out (
            Error VARCHAR(10)
        );
        
        INSERT INTO Table_IN VALUES (1);
        INSERT INTO Table_IN VALUES (7);
    </script>

    <query connection-id="db">
        SELECT * FROM Table_In
        <script connection-id="log">
            Transforming $Error_Code
        </script>
        <!-- Transformation is described as an enclosing query
         which is executed before nested elements -->
        <query connection-id="js"> 
            <![CDATA[
               if (Error_Code < 5) {
                    Error_Code='WARNING'; //Set a transformed value
               } else {
                    Error_Code='ERROR'; //Set a transformed value
               }
               query.next(); //Don't forget to trigger nested scripts execution
            ]]>
            <script connection-id="db">
                <!-- Insert transformed value -->
                INSERT INTO Table_Out VALUES (?Error_Code); 
            </script>
            <script connection-id="log">
                Transformed to $Error_Code
            </script>
        </query>
    </query>
</etl>

從 Excel CSV 檔載入到資料庫

CSV 檔案格式如下:
id,priority,summary,status
1,Critical,NullPointerException in Main class,Open
5,Low,"Checkstyle, PMD, Findbugs issues",Reopened
7,Low,Maven integration,Open
10,High,SPI API,Closed

etl.xml 範例如下:
<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
  <connection id="in" driver="csv" url="data.csv"/>
  <connection id="out" 
              driver="oracle" url="jdbc:oracle:thin:@localhost:1521:ORCL" 
              classpath="ojdbc14.jar" user="scott" password="tiger"/>

  <!-- Copy all CSV rows to a database table -->
  <query connection-id="in">
      <!-- Empty query means select all columns -->
      <script connection-id="out">
          INSERT INTO Table_Name VALUES (?id,?priority, ?summary, ?status)
      </script>
  </query>
</etl>

Reference

標籤:
  1. Look at the way my buddy Wesley Virgin's biography starts with this SHOCKING AND CONTROVERSIAL video.

    You see, Wesley was in the army-and soon after leaving-he unveiled hidden, "MIND CONTROL" tactics that the government and others used to get everything they want.

    These are the same methods many famous people (especially those who "became famous out of nowhere") and the greatest business people used to become wealthy and famous.

    You probably know that you use less than 10% of your brain.

    Really, that's because the majority of your brainpower is UNCONSCIOUS.

    Perhaps that thought has even occurred IN YOUR own mind... as it did in my good friend Wesley Virgin's mind about 7 years ago, while riding a non-registered, trash bucket of a vehicle without a driver's license and $3 on his banking card.

    "I'm absolutely fed up with living check to check! When will I become successful?"

    You've taken part in those types of questions, ain't it so?

    Your very own success story is waiting to start. You need to start believing in YOURSELF.

    UNLOCK YOUR SECRET BRAINPOWER

張貼留言

技術提供:Blogger.