We have serious performance issue in Estonian Energy live environment where batch process which reads data from database and inserts xml messages to Oracle AQ queue performed astonishing slowly in test environment and even worse in live environment. Whole batch process was written in plsql and it take 1,5days to fix this issue.
Actual problem was that it was really difficult to understand where is problem, because all database select/insert/update statements performed well and it was shortly clear that problem must be somewhere else. But where?
Database administrator pointed to procedure where xml was put together from plain text and DBMS_LOB.append was used to concatenate strings. To make things more complicated there was almost identical process which performed well in all environments. The main difference between these 2 procedures was that in first one there was DBMS_LOB.append called 30 times per xml message and in second (slooow) one 744 times.
I started to investigate in internet about fast text concatenation strategies with CLOB-s in Oracle plsql and find the article from Martin Giffy D'Souza where actual life saver came from last comment(!) (app procedure).
Idea is that to append small text chunks to CLOB use actually to buffers - VARCHAR2(32767) and CLOB. Every time you need to append new chunk to clob you first try to append to varchar2 buffer and if it fails then append varchar2 buffer to clob and assign chunk to varchar2 buffer. In this scenario we will use CLOB.append only cases where text is longer then 32767 and all other cases we use simple text concatenation. Actually I replace CLOB.append with CLOB.writeappend because in test it was little bit faster.
Procedure to implement CLOB/VARCHAR2 concatenation:
-- fast varchar2/clob text concatenation
PROCEDURE append_text(io_clob IN OUT NOCOPY CLOB
, io_buffer IN OUT NOCOPY VARCHAR2
, i_text IN VARCHAR2) IS
BEGIN
io_buffer := io_buffer || i_text;
EXCEPTION
WHEN VALUE_ERROR THEN
IF io_clob IS NULL THEN
io_clob := io_buffer;
ELSE
DBMS_LOB.writeappend(io_clob
, LENGTH(io_buffer)
, io_buffer);
io_buffer := i_text;
END IF;
END append_text;
Everywhere in my code I replace
DBMS_LOB.append(m_xml, m_tmp);
-- m_xml temporary clob created with DBMS_LOB.createtemporary(m_xml, TRUE);
-- m_tmp varchar2(32767) - xml chunk, what must be appended to end of clob
with
append_text(m_xml, m_buffer, m_tmp);
-- m_buffer procedure varchar2(32767) variable
and additionally before to return clob from function I added this line
DBMS_LOB.writeappend (m_xml, LENGTH(m_buffer), m_buffer);
-- add varchar2 buffer to clob
It came out that only this small change made 4-5times overall performance gain, so we were able to process 80000-90000 source records per minute instead of 20000 records with previous version.
Database version: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0
Final code:
FUNCTION get_consumption (i_intervaldata_xml_tbl IN t_intervaldata_xml_tbl)
RETURN CLOB IS
m_xml CLOB;
m_tmp VARCHAR(32767);
m_buffer VARCHAR(32767); m_in_qty VARCHAR(100);
m_out_qty VARCHAR(100);
m_first_intervaldata_xml_rec t_intervaldata_xml_rec;
m_last_intervaldata_xml_rec t_intervaldata_xml_rec;
m_timezone VARCHAR2(30);
BEGIN
DBMS_LOB.createtemporary(m_xml, TRUE);
m_first_intervaldata_xml_rec := i_intervaldata_xml_tbl(i_intervaldata_xml_tbl.FIRST);
m_last_intervaldata_xml_rec := i_intervaldata_xml_tbl(i_intervaldata_xml_tbl.LAST);
m_timezone := get_timezone(m_first_intervaldata_xml_rec.intervaldata_type);
m_tmp := get_xml_header() || b_tools.c_eol ||
'<EnergyReport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="EnergyReport.xsd">' || b_tools.c_eol ||
' <DocumentIdentification v="' || b_tools.int2str(m_first_intervaldata_xml_rec.document_id) || '"/>' || b_tools.c_eol ||
' <SenderIdentification v="'|| g_sender_identification ||'"/>' || b_tools.c_eol ||
' <ReceiverIdentification v="'|| g_receiver_identification ||'"/>' || b_tools.c_eol ||
' <DocumentDateTime v="' || b_tools.dtime2str(sysdate, 'Y') || '"/>' || b_tools.c_eol ||
' <AccountTimeSeries>' || b_tools.c_eol ||
' <MeasurementUnit v="'|| g_measurement_unit ||'" />' || b_tools.c_eol ||
' <AccountingPoint v="' || m_first_intervaldata_xml_rec.metering_point_eic || '"/>' || b_tools.c_eol ||
' <Period>' || b_tools.c_eol ||
' <TimeInterval v="' || b_tools.dtime2str(TRUNC(m_first_intervaldata_xml_rec.start_dtime,'HH24'), 'N', m_timezone) || '/' || b_tools.dtime2str(TRUNC(m_last_intervaldata_xml_rec.stop_dtime,'HH24') + 1/24, 'N', m_timezone) || '" />' || b_tools.c_eol ||
' <Resolution v="'|| g_resolution ||'" />' || b_tools.c_eol;
append_text(m_xml, m_buffer, m_tmp);
FOR i IN i_intervaldata_xml_tbl.FIRST .. i_intervaldata_xml_tbl.LAST LOOP
m_out_qty := case when i_intervaldata_xml_tbl(i).outgoing_qty is null then null else ' <OutQty v="' || b_tools.qty2str(i_intervaldata_xml_tbl(i).outgoing_qty) || '" />' end;
m_in_qty := case when i_intervaldata_xml_tbl(i).incoming_qty is null then null else ' <InQty v="' || b_tools.qty2str(i_intervaldata_xml_tbl(i).incoming_qty) || '" />' end;
m_tmp :=
' <AccountInterval>' || b_tools.c_eol ||
' <Pos v="' || b_tools.int2str(i_intervaldata_xml_tbl(i).position) || '"/>' || b_tools.c_eol ||
m_out_qty || b_tools.c_eol ||
m_in_qty || b_tools.c_eol ||
' </AccountInterval>' || b_tools.c_eol;
append_text(m_xml, m_buffer, m_tmp);
END LOOP;
m_tmp :=
' </Period>' || b_tools.c_eol ||
' </AccountTimeSeries>' || b_tools.c_eol ||
' </EnergyReport>';
append_text(m_xml, m_buffer, m_tmp);
DBMS_LOB.writeappend (m_xml, LENGTH(m_buffer), m_buffer);
RETURN m_xml;
END get_consumption;