Leo Leong's Blog: 2012

Monday, October 8, 2012

Database - SQL Server - Logging

I would like to talk about stored procedure logging today. Many DBAs or database developers always have a so called "logging framework" in their applications. I would like to share my "logging framework" in this blog. I hope you guys can provide some thought to make it more robust.

To me, logging and audit trail should be separated. So, I try not to store who and when someone ran a stored procedure in my application.

What do we log?
1. Execution time
a. start and end execution time of a stored procedure
b. start and end execution time of a step in a stored procedure
2. Error
a. catch and store error information for troubleshooting later

Why do we log?
1. Enable us to identify slow running stored procedures
2. Give us some hints to take proactive actions before a stored procedure screws up our application/database/server
3. Resolve errors that occurred during run-time (improve application quality by removing possible flaw in the stored procedure)

When do we log?
IMO, we don't log all stored procedures. So, I would log stored procedures that only involve operations below:
1. ETL stored procedure
2. Report stored procedure
3. Stored procedures that contain complex business logic
For those stored procedures that need to be executed FAST and required to support very high concurrency, I don't recommend to add logging to the stored procedure. Each byte that we use for disk IO access is very important. For the same reason, each variable that we declare in the stored procedure does consume memory. Well, we shouldn't discuss about potential performance issues in this blog.

My requirements of logging
1. Able to be turned on/off at stored procedure level
2. Able to store the log information in table or output the log information on screen immediately
3. Able to maintain the link between execution step info and error message
4. Fast - only sequential INSERTs (perhaps minimum UPDATE is acceptable)
5. Log data is easy to analyze

Open for discussion
1. Do you prefer storing logging tables on separate data files?
2. Would you centralize all logging in a dedicated database/server?

Any comment is most welcome. :)

Codes:
IF NOT EXISTS(SELECT 1 FROM sys.schemas WHERE name='ERROR')

EXECUTE ('CREATE SCHEMA ERROR')
GO
IF OBJECT_ID('ERROR.error') IS NOT NULL
DROP TABLE ERROR.error
GO
CREATE TABLE ERROR.error(
ErrorDateTime SMALLDATETIME NOT NULL CONSTRAINT DF_ErrorDateTime DEFAULT GETUTCDATE(),
ErrorNumber INT NOT NULL,
ErrorSeverity INT NOT NULL,
ErrorState INT NOT NULL,
ErrorLine INT NOT NULL,
traceSummaryID INT NOT NULL,
ErrorProcedure NVARCHAR(126) NOT NULL CONSTRAINT DF_ErrorProcedure DEFAULT '',
ErrorMessage NVARCHAR(2048) NOT NULL CONSTRAINT DF_ErrorMessage DEFAULT ''
)
GO
IF OBJECT_ID('ERROR.INS_error') IS NOT NULL
DROP PROCEDURE ERROR.INS_error
GO
CREATE PROCEDURE ERROR.INS_error
@iTraceSummaryID INT,
@iPrintOnScreen BIT=0 --developer wants to see the output on screen only
AS
BEGIN
SET NOCOUNT ON;
IF @iPrintOnScreen=1
PRINT 'Err '+CONVERT(VARCHAR,ERROR_NUMBER())+',Severity '+CONVERT(VARCHAR,ERROR_SEVERITY())+', State '+CONVERT(VARCHAR,ERROR_STATE())+', Line '+CONVERT(VARCHAR,ERROR_LINE())+' - '+ERROR_MESSAGE()
ELSE
INSERT INTO ERROR.error(ErrorNumber,ErrorSeverity,ErrorState,ErrorProcedure,ErrorLine,ErrorMessage,traceSummaryID)
SELECT
ISNULL(ERROR_NUMBER(),0) AS ErrorNumber
,ISNULL(ERROR_SEVERITY(),0) AS ErrorSeverity
,ISNULL(ERROR_STATE(),0) AS ErrorState
,ISNULL(ERROR_PROCEDURE(),'') AS ErrorProcedure
,ISNULL(ERROR_LINE(),0) AS ErrorLine
,ISNULL(ERROR_MESSAGE(),'') AS ErrorMessage
,ISNULL(@iTraceSummaryID,0) AS traceSummaryID;
END
GO
IF NOT EXISTS(SELECT 1 FROM sys.schemas WHERE name='CONFIG')
EXECUTE ('CREATE SCHEMA CONFIG')
GO
IF OBJECT_ID('CONFIG.procReturnCode') IS NOT NULL
DROP TABLE CONFIG.procReturnCode
GO
-- List of Return Codes that must be same across all environments
CREATE TABLE CONFIG.procReturnCode(
procReturnCodeID TINYINT NOT NULL,
name NVARCHAR(120) NOT NULL
)
GO
INSERT INTO CONFIG.procReturnCode(procReturnCodeID,name)
SELECT 0,'OK' UNION ALL
SELECT 10,'Error-Schema' UNION ALL
SELECT 30,'Error-Business Logic' UNION ALL
SELECT 100,'Error-Data Validation' UNION ALL
SELECT 240,'Error-Others';
GO
IF NOT EXISTS(SELECT 1 FROM sys.schemas WHERE name='LOGGING')
EXECUTE ('CREATE SCHEMA LOGGING')
GO
IF OBJECT_ID('LOGGING.traceSummary') IS NOT NULL
DROP TABLE LOGGING.traceSummary
GO
CREATE TABLE LOGGING.traceSummary(
traceSummaryID INT IDENTITY(-2147483648,1) CONSTRAINT PK_traceSummary PRIMARY KEY,
procName NVARCHAR(261) NOT NULL
)
GO
IF OBJECT_ID('LOGGING.traceDetail') IS NOT NULL
DROP TABLE LOGGING.traceDetail
GO
CREATE TABLE LOGGING.traceDetail(
traceSummaryID INT NOT NULL,
step TINYINT NOT NULL CONSTRAINT DF_traceDetail_step DEFAULT 0,
traceDateTime DATETIME NOT NULL CONSTRAINT DF_traceDetail_traceDateTime DEFAULT GETUTCDATE(),
stepName NVARCHAR(128) NOT NULL CONSTRAINT DF_traceDetail_stepName DEFAULT ''
)
GO
CREATE NONCLUSTERED INDEX IDX_LOGGING_traceDetail_1 ON LOGGING.traceDetail(traceSummaryID) INCLUDE(step)
GO
IF OBJECT_ID('LOGGING.INS_trace') IS NOT NULL
DROP PROCEDURE LOGGING.INS_trace
GO
CREATE PROCEDURE LOGGING.INS_trace
@iTraceSummaryID INT,
@iProcName NVARCHAR(261),
@iStepName NVARCHAR(128)='', --optional, for developer to mark a step only
@iPrintOnScreen BIT=0 --developer wants to see the output on screen only
AS
BEGIN
IF @iPrintOnScreen=1
BEGIN
PRINT CONVERT(VARCHAR,GETUTCDATE(),121)+'-'+@iStepName
RETURN @iTraceSummaryID
END
IF @iTraceSummaryID=0
BEGIN
INSERT INTO LOGGING.traceSummary(procName) VALUES (@iProcName);
SET @iTraceSummaryID=SCOPE_IDENTITY();
END
INSERT INTO LOGGING.traceDetail(traceSummaryID,stepName,step)
SELECT @iTraceSummaryID,@iStepName
,(SELECT ISNULL(MAX(step),0)+1 FROM LOGGING.traceDetail WITH(NOLOCK) WHERE traceSummaryID=@iTraceSummaryID)

RETURN @iTraceSummaryID
END
GO
IF OBJECT_ID('dbo.template_Transaction') IS NOT NULL
DROP PROCEDURE dbo.template_Transaction
GO
CREATE PROCEDURE dbo.template_Transaction
@iParam INT,
@oParam INT OUTPUT
AS
BEGIN TRY
SET NOCOUNT ON;
-- ********************************************************************************************* //
-- Local Variables & Initialization
DECLARE @returnCode TINYINT=0
DECLARE @traceSummaryID INT=0
DECLARE @printOnScreen BIT=0
DECLARE @proc NVARCHAR(261)=QUOTENAME(OBJECT_SCHEMA_NAME(@@PROCID))+'.'+QUOTENAME(OBJECT_NAME(@@PROCID))

-- Logging
EXECUTE @traceSummaryID=LOGGING.INS_trace @traceSummaryID,@proc,'Start',@printOnScreen
-- ********************************************************************************************* //
-- Temporary Tables
-- CREATE TABLE #t(ID INT)
-- DECLARE @t AS TABLE(ID INT)
EXECUTE @traceSummaryID=LOGGING.INS_trace @traceSummaryID,@proc,'Temporary Table',@printOnScreen
-- Business Logic goes here

-- NOTE: Avoid calling LOGGING.INS_trace in the TRANSACTION!
-- 1. reduce LOCK on trace tables
-- 2. when rolled-back, everything is gone
-- Keep transaction small
BEGIN TRANSACTION

END_OF_SP:
COMMIT
-- Logging
EXECUTE @traceSummaryID=LOGGING.INS_trace @traceSummaryID,@proc,'End',@printOnScreen
END TRY
BEGIN CATCH
IF XACT_STATE()= 1 COMMIT
IF XACT_STATE()=-1 ROLLBACK
-- execute everything else after COMMIT/ROLLBACK
EXECUTE ERROR.INS_error @traceSummaryID,@printOnScreen --establish a link between logging and error handling
RETURN @returnCode -- refer CONFIG.procReturnCode
END CATCH
GO

Friday, July 27, 2012

Database: Unique Identifier or Identity

In most of the RDBMS, DBA or DB developer has always considering whether he/she should use Unique Identifier (GUID) or Identity (Auto-Increment, sequence etc).

To be frank, many experienced DBAs have already known the answer long long time ago. Perhaps, some of them might have already posted this topic somewhere else on the Internet.

So, what do experts recommend? IDENTITY, if possible.

Why?
1. IDENTITY column is always unique and sorted in a sequence that we want (ascending or descending)
2. With these unique characteristics, it saves a lot of resources when INSERTING data into the table
i. new data always goes to the bottom of the BTree/Heap
ii. database engine itself does not need to adjust the BTree when new data come in
iii. it is much easier to partition the data (well, it depends on your data partitioning strategy)
iv. if this column is a PK or clustered index, it will impact the storage requirement of your non-clustered indexes. (For MSSQL, Unique Identifier is 16 bytes while BIGINT is 8 bytes) As you know, each non-clustered index will contain your clustered index key.
3. You hardly perform UPDATE on IDENTITY column (and it is not a good practice to update IDENTITY column)

However, there are some challenges that we might face when using IDENTITY
1. Limit of BIGINT is 9,223,372,036,854,775,807
- in BIG DATA era, it becomes possible to hit the limit
2. Data migration between databases is not straight forward
- we have to put extra effort to ensure 100% data integrity

Tuesday, May 22, 2012

Database Backup - Differences between Oracle, MSSQL, and MySQL

Oracle 10g
(refer http://docs.oracle.com/cd/A57673_01/DOC/server/doc/SCN73/ch23.htm)

A. Full Backup
i) NOARCHIVELOG mode - best for one-time database restore e.g. create testing/development database
ii) ARCHIVELOG mode - best for database recovery

B. Partial Backup
i) backup all datafiles under ONE tablespace
ii) backup SINGLE datafile
iii) backup a control file
> useful under ARCHIVELOG mode so that upon restore, changes will be applied based on archived redo log.

C. Incremental Backup
i) differential incremental backup - changes in data blocks since most recent backup at the same level or lower
ii) cumulative incremental backup - changes in data blocks since most recent backup of the next lowest level

MSSQL 2005/2008/2012
Concepts of Recovery Model
i) Simple - no support for log backup
ii) Bulk-Logged - minimal logging for bulk operation. DMLs are still logged. Bulk operation after last backup cannot be recovered.
iii) Full - support point-in-time recovery
(refer http://msdn.microsoft.com/en-us/library/ms189275.aspx)

A. Full Backup - data backup (all online data files and transaction logs) at certain point of time.
- if there is any offline datafile, full backup operation will fail. administrator has to perform partial backup instead.
(refer http://msdn.microsoft.com/en-us/library/ms175477.aspx)

B. Log Backup - changes in transaction log based on last full/differential/log backup (only available when recovery model is set to FULL or Bulk-Logged)
(refer http://msdn.microsoft.com/en-us/library/ms191429.aspx)

C. File Backup - backup all the data in one or more online files or filegroups (a filegroup that contains 1 or more offline datafile will be considered as offline filegroup. therefore, it cannot be backed up)
(refer http://msdn.microsoft.com/en-us/library/ms189860.aspx)

D. Partial Backup - useful when we want to exclude read-only filegroups
(refer http://msdn.microsoft.com/en-us/library/ms191539.aspx)

E. Differential Backup - changes since latest full/partial backup (a.k.a. cumulative incremental backup)

MySQL 5.0/5.1/5.5
(refer http://dev.mysql.com/doc/refman/5.0/en/backup-types.html)

A. Full Backup
- complete database backup
- Command: mysqldump (for MyISAM/Archive table, can use mysqlhotcopy)
- does not include INFORMATION_SCHEMA
- does not include PERFORMANCE_SCHEMA
- does not include ndbinfo
- does not include log/configuration file
(refer http://dev.mysql.com/doc/refman/5.5/en/mysqldump.html and http://dev.mysql.com/doc/refman/5.5/en/mysqlhotcopy.html)

B. Incremental Backup
- MUST enable --log-bin on startup (or specify log_bin and expire_log_days in my.cnf)
- flush bin log to file for database recovery later (similar to log backup)

C. Snapshot Backup - capability of file system. provided by third party (Veritas, LVM, ZFS)

Note: physical backup vs logical backup, online backup vs offline backup, import/export data and compressed backup are not covered in this article.

next post will discuss how to use database backups to perform recovery.

Sunday, April 22, 2012

PHP Yii SWFUpload 302

Possible Root Cause:
SWFUpload opens a new PHP session when connecting to Apache and uploading file(s) to web server's file system.

Solution:
1. Remember to pass in postParams
  $this->widget('application.extensions.swfupload.CSwfUpload', array(
'jsHandlerUrl'=>'js/swfupload/handlers.js',
'postParams'=>array('PHPSESSID'=>session_id()),
'config'=>array( ...

2. In the Controller, modify accessRules() to allow any user to call the action that is going to be used by SWFUpload.
Below is are sample codes in my View file:

$uploadUrl=$this->createUrl('photo/swfupload');
$this->widget('application.extensions.swfupload.CSwfUpload', array(
'jsHandlerUrl'=>'js/swfupload/handlers.js',
'postParams'=>array('PHPSESSID'=>session_id()),
'config'=>array(
'debug'=>false,
'use_query_string'=>true,
'upload_url'=>$uploadUrl,
'file_size_limit'=>'5 MB',
'file_types'=>'*.jpg;*.png;*.gif',
'file_types_description'=>'Image Files',
'file_upload_limit'=>0,
'file_queue_error_handler'=>'js:fileQueueError',
'file_dialog_complete_handler'=>'js:fileDialogComplete',
'upload_progress_handler'=>'js:uploadProgress',
'upload_error_handler'=>'js:uploadError',
'upload_success_handler'=>'js:uploadSuccess',
'upload_complete_handler'=>'js:uploadComplete',
'custom_settings'=>array('progressTarget'=>'divFileProgressContainer','cancelButtonId'=>'btnCancel'),
'button_placeholder_id'=>'swfupload',
  'button_image_url'=>'js/swfupload/images/TestImageNoText_100x20.png',
'button_width'=>50,
'button_height'=>20,
'button_text'=>' Select File(s) ',
'button_text_style'=>'.button { font-family: "Lucida Sans Unicode", "Lucida Grande", sans-serif; font-size: 9pt; text-align: center; }',
'button_text_top_padding'=>0,
'button_text_left_padding'=>0,
'button_window_mode'=>'js:SWFUpload.WINDOW_MODE.TRANSPARENT',
'button_cursor'=>'js:SWFUpload.CURSOR.HAND',
),
)
);
?>

$uploadUrl); ?>

As you can see, photo/swfupload is my controller/action pair。
So, in photoController file, we shall change the accessRules() function to allow anyone to call actionSwfupload(). Below are my sample codes:
public function accessRules()
{
return array(
array('allow', // allow all users to perform 'index' and 'view' actions
'actions'=>array('index','view','swfupload'),
'users'=>array('*'),
),
);
}

Please feel free to share your experience on this issue.

Monday, April 2, 2012

PHP Yii Complex SQL

Yii allows developers to construct SQL statement.
This will help advanced developer to tune his/her SQL query (by using appropriate indexes).

Example:
$connection=Yii::app()->db;

$sql = "SELECT t1.colA, t2.colB, t3.colC, t3.colD
FROM tableA t1
LEFT JOIN tableB t2 ON t1.id=t2.id
LEFT JOIN tableC t3 ON t2.pid=t3.pid
WHERE t1.id BETWEEN 10 AND 100
ORDER BY t1.modifyDate DESC";
$command = $connection->createCommand($sql);
$dataReader = $command->query();
$result = $dataReader->readAll();

Internal:
$command=createCommand($query)
Note: $command is a CDbCommand
|- $dataReader = query()
Note: $dataReader is a CDbDataReader
|- CDBDataReader->readAll() returns array()

Sunday, April 1, 2012

PHP Yii ActiveRecord FindAllBySql()

Definition:
public array findAllBySql(string $sql, array $params=array ( ))
(refer http://www.yiiframework.com/doc/api/1.1/CActiveRecord#findAllBySql-detail)

Explanation:
Find all active records using specified SQL statement

Example:
1. $model = MyModel::model()->findAllBySql('SELECT colA, colB FROM any_table');
2. $model = MyModel::model()->findAllBySql($sqlString);
3. $model = MyModel::model()->findAllBySql('SELECT * FROM any_table WHERE colA=:param', array(':param'=>$value));

Internal:

findAllBySql($sql, $param)
|-$command = createSqlCommand($sql,$params)
Note: $command is a CDbCommand
|-populateRecords($command->queryAll())

PHP Yii ActiveRecord FindAllByPk()

Definition:
public array findAllByPk(mixed $pk, mixed $condition='', array $params=array ( ))
(refer http://www.yiiframework.com/doc/api/1.1/CActiveRecord#findAllByPk-detail)

Explanation:
Find all active records with specified primary key

$pk can be either
1. string -> single-key primary key
2. array -> single-composite-key primary key
(refer createPkCriteria in yii\framework\db\schema\CDbCommandBuilder.php)

Example:
1. $model = MyModel::model()->findAllByPk($id);
2. $model = MyModel::model()->findAllByPk(array('pkCol1'=>$col1, 'pkCol2'=>$col2));

Internal:

findAllByPk($pk, $condition, $param)
|-$criteria = createPkCriteria($this->getTableSchema(),$pk,$condition,$params,$prefix)
Note: $criteria is a CDbCriteria
|-query($criteria, TRUE)
Note: TRUE means select all active records

Friday, March 30, 2012

PHP Yii ActiveRecord findAllByAttributes()

Definition:
public array findAllByAttributes(array $attributes, mixed $condition='', array $params=array ( ))
(refer http://www.yiiframework.com/doc/api/1.1/CActiveRecord#findAllByAttributes-detail)

Explanation:
Finds all active records that have specific attribute values

Example:
1. $model = MyModel::model()->findAllByAttributes(array('id'=>1, 'colA'='mystring'));
2. $model = MyModel::model()->findAllByAttributes(array('id'=>1, 'colA'='mystring'), "colB='mystring2'");
3. $model = MyModel::model()->findAllByAttributes(array('id'=>1, 'colA'='mystring'), "colB=:param", array(":param"=>'mystring2'));

Internal:

findAllByAttributes($attribute, $condition, $param)
|-$criteria = createColumnCriteria($this->getTableSchema(), $attributes, $condition, $param, $prefix)
Note: $criteria is a CDbCriteria
$prefix is defaulted to NULL. When calling findAllByAttributes, this function will not pass in $prefix. Therefore, in this case, $prefix is always NULL.
|-query($criteria, TRUE)

Thursday, March 29, 2012

PHP Yii ActiveRecord FindAll()

Definition:
public array findAll(mixed $condition='', array $params=array ( ))
(refer http://www.yiiframework.com/doc/api/1.1/CActiveRecord#findAll-detail)

Explanation:
Find all active records that satisfy specific conditions
findAll() returns array instead of CAtiveRecord

Example:
1. $model = MyModel::model()->findAll("id=1");

Internal:

please refer find() for explanation of $condition.

findAll($condition, $param)

|-$criteria = createCriteria($condition, $param)

> Note: $criteria is a CDbCriteria

|-query($criteria, TRUE)

Note: TRUE has been passed in so that system will retrieve all active records

(refer yii\framework\db\ar\CActiveRecord.php)

Tuesday, March 27, 2012

PHP Yii ActiveRecord Find()

Definition:

public CActiveRecord find(mixed $condition='', array $params=array ( ))

(refer http://www.yiiframework.com/doc/api/1.1/CActiveRecord#find-detail)

Explanation:

Find a single active record with specified conditions

Always return first record in the result set

$condition can be either

1. array

2. CDbCriteria

3. string

When $condition is an array, CDbCriteria class will map each key to predefined field name. Below is a list of predefined field name:

a. $select [default = '*']

b. $distinct [default = FALSE]

c. $condition [default = '']

d. $param [default = array()]

e. $limit [default = -1]

f. $offset [default = -1]

g. $order [default = '']

h. $group [default = '']

i. $join [default = '']

j. $having [default = '']

k. $with - alias of the table name [default = 't']
l. others - $alias, $together, $index, and $scopes

When $condition is a string, system will map it to CDbCriteria->$condition (item c above)

Example:

1. $model = MyModel::model()->find("id=1");

2. $model = MyModel::model()->find("id={$id}");

3. $model = MyModel::model()->find("value='mystring'");

4. $model = MyModel::model()->find("value='{$mystring}'");

5. $model = MyModel::model()->find("id=1 ORDER BY colA ASC");

6. $model = MyModel::model()->find("id=:param", array(":param"=>1));

7. $model = MyModel::model()->find("id=:param1 AND colA=:param2", array(":param1"=>1, ":param2"=>'mystring'));
8. $model = MyModel::model()->find(array('select'=>'colA', 'condition'=>'id=1', 'order'=>'colB DESC'));

Internal:

find($condition, $param)