Sunday, February 3, 2013

Performance issue with DBMS_LOB.append in plsql

We have serious performance issue in Estonian Energy live environment where batch process which reads data from database and inserts xml messages to Oracle AQ queue performed astonishing slowly in test environment and even worse in live environment. Whole batch process was written in plsql and it take 1,5days to fix this issue.

Actual problem was that it was really difficult to understand where is problem, because all database select/insert/update statements performed well and it was shortly clear that problem must be somewhere else. But where?

Database administrator pointed to procedure where xml was put together from plain text and DBMS_LOB.append was used to concatenate strings. To make things more complicated there was almost identical process which performed well in all environments. The main difference between these 2 procedures was that in first one there was DBMS_LOB.append called 30 times per xml message and in second (slooow) one 744 times.

I started to investigate in internet about fast text concatenation strategies with CLOB-s in Oracle plsql and find the article from Martin Giffy D'Souza where actual life saver came from last comment(!) (app procedure).

Idea is that to append small text chunks to CLOB use actually to buffers - VARCHAR2(32767) and CLOB. Every time you need to append new chunk to clob you first try to append to varchar2 buffer and if it fails then append varchar2 buffer to clob and assign chunk to varchar2 buffer. In this scenario we will use CLOB.append only cases where text is longer then 32767 and all other cases we use simple text concatenation. Actually I replace CLOB.append with CLOB.writeappend because in test it was little bit faster.

Procedure to implement CLOB/VARCHAR2 concatenation:
 -- fast varchar2/clob text concatenation
  PROCEDURE append_text(io_clob    IN OUT NOCOPY CLOB
                      , io_buffer  IN OUT NOCOPY VARCHAR2
                      , i_text     IN            VARCHAR2) IS
  BEGIN
    io_buffer := io_buffer || i_text;
  EXCEPTION
    WHEN VALUE_ERROR THEN
      IF io_clob IS NULL THEN
        io_clob := io_buffer;
      ELSE
        DBMS_LOB.writeappend(io_clob
                           , LENGTH(io_buffer)
                           , io_buffer);
        io_buffer := i_text;
      END IF;
  END append_text;


Everywhere in my code I replace
DBMS_LOB.append(m_xml, m_tmp);
-- m_xml temporary clob created with DBMS_LOB.createtemporary(m_xml, TRUE);
-- m_tmp varchar2(32767) - xml chunk, what must be appended to end of clob

with

append_text(m_xml, m_buffer, m_tmp);
-- m_buffer procedure varchar2(32767) variable

and additionally before to return clob from function I added this line
DBMS_LOB.writeappend (m_xml, LENGTH(m_buffer), m_buffer);
-- add varchar2 buffer to clob

It came out that only this small change made 4-5times overall performance gain, so we were able to process 80000-90000 source records per minute instead of 20000 records with previous version.

Database version: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0

Final code:
  FUNCTION get_consumption (i_intervaldata_xml_tbl IN t_intervaldata_xml_tbl)
    RETURN CLOB IS
    m_xml                     CLOB;
    m_tmp                     VARCHAR(32767);
    m_buffer                  VARCHAR(32767);
    m_in_qty                  VARCHAR(100);
    m_out_qty                 VARCHAR(100);
    m_first_intervaldata_xml_rec t_intervaldata_xml_rec;
    m_last_intervaldata_xml_rec  t_intervaldata_xml_rec;
    m_timezone                VARCHAR2(30);

  BEGIN
 
    DBMS_LOB.createtemporary(m_xml, TRUE);
    m_first_intervaldata_xml_rec := i_intervaldata_xml_tbl(i_intervaldata_xml_tbl.FIRST); 
    m_last_intervaldata_xml_rec := i_intervaldata_xml_tbl(i_intervaldata_xml_tbl.LAST); 
    m_timezone := get_timezone(m_first_intervaldata_xml_rec.intervaldata_type);
 
   m_tmp := get_xml_header() || b_tools.c_eol ||
    '<EnergyReport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="EnergyReport.xsd">' || b_tools.c_eol ||
    ' <DocumentIdentification v="' || b_tools.int2str(m_first_intervaldata_xml_rec.document_id) || '"/>' || b_tools.c_eol ||
    ' <SenderIdentification v="'|| g_sender_identification ||'"/>' || b_tools.c_eol ||
    ' <ReceiverIdentification v="'|| g_receiver_identification ||'"/>' || b_tools.c_eol ||
    ' <DocumentDateTime v="' || b_tools.dtime2str(sysdate, 'Y') || '"/>' || b_tools.c_eol ||
    ' <AccountTimeSeries>' || b_tools.c_eol ||
    '  <MeasurementUnit v="'|| g_measurement_unit ||'" />' || b_tools.c_eol ||
    '  <AccountingPoint v="' || m_first_intervaldata_xml_rec.metering_point_eic || '"/>' || b_tools.c_eol ||
    '   <Period>' || b_tools.c_eol ||
    '    <TimeInterval v="' || b_tools.dtime2str(TRUNC(m_first_intervaldata_xml_rec.start_dtime,'HH24'), 'N', m_timezone) || '/' || b_tools.dtime2str(TRUNC(m_last_intervaldata_xml_rec.stop_dtime,'HH24') +  1/24, 'N', m_timezone) || '" />' || b_tools.c_eol ||
    '    <Resolution v="'|| g_resolution ||'" />' || b_tools.c_eol;
    append_text(m_xml, m_buffer, m_tmp);
   
    FOR i IN i_intervaldata_xml_tbl.FIRST .. i_intervaldata_xml_tbl.LAST LOOP
      m_out_qty := case when i_intervaldata_xml_tbl(i).outgoing_qty is null then null else '     <OutQty v="' || b_tools.qty2str(i_intervaldata_xml_tbl(i).outgoing_qty) || '" />' end;
      m_in_qty  := case when i_intervaldata_xml_tbl(i).incoming_qty is null then null else '     <InQty v="'  || b_tools.qty2str(i_intervaldata_xml_tbl(i).incoming_qty) || '" />' end;
      
      m_tmp :=
      '    <AccountInterval>' || b_tools.c_eol ||
      '     <Pos v="' || b_tools.int2str(i_intervaldata_xml_tbl(i).position) || '"/>' || b_tools.c_eol ||
      m_out_qty || b_tools.c_eol ||
      m_in_qty || b_tools.c_eol ||
      '    </AccountInterval>' || b_tools.c_eol;
      append_text(m_xml, m_buffer, m_tmp);
    END LOOP;
   
    m_tmp :=
    '   </Period>' || b_tools.c_eol ||
    '  </AccountTimeSeries>' || b_tools.c_eol ||
    ' </EnergyReport>';
    append_text(m_xml, m_buffer, m_tmp);
    DBMS_LOB.writeappend (m_xml, LENGTH(m_buffer), m_buffer);

    RETURN m_xml;  
  END get_consumption;  

2 comments:

  1. It appears there is a small bug in this implementation. When m_buffer is NULL, and therefore LENGTH(m_buffer) is NULL, DBMS_LOB.writeappend() throws an exception. I fixed this in my code by surrounding the very last DBMS_LOB.writeappend() with IF LENGTH(m_buffer) > 0 THEN ... END IF:

    By the way, thank you very much for posting this. The speedup of my string concatenation was tremendous. I was really frustrated with how slow appending to a CLOB was, but your code really made a difference.

    Jonathan

    ReplyDelete
  2. thank you very much, i tried it and found very useful ..
    thanks again

    ReplyDelete