Redshift connector forces SSIS to stop

Redshift related discussions

Redshift connector forces SSIS to stop

Postby danielbos » Thu Dec 05, 2013 12:35 am

Hi all,

I use the Redshift connector. Since I am testing the connector right now with Redshift I created a simple package which get data from table A (Redshift) and insert it into table B (also Redshift).
ScreenHunter_19 Dec. 05 08.24.gif
SSIS
ScreenHunter_19 Dec. 05 08.24.gif (13.56 KiB) Viewed 1816 times


My environment:
Fresh install of:
    Windows Server 2012
    SQL Server 2012 Standard

The settings I use for the PGNP driver are:
    Fetch Rows: 100000
    Bulk Insert Method: Pipe/Copy
    Bulk Insert Buffer Size: 100000
    AWS keys and such are filled in in order to use S3 for the copy operation.

After about 4 milion records the SSIS package fails and I have no idea how to solve this.

The errors that I could find in the Eventlog:
ScreenHunter_20 Dec. 05 08.30.gif
Error SSIS
ScreenHunter_20 Dec. 05 08.30.gif (5.4 KiB) Viewed 1816 times

Code: Select all
Application: DtExecUI.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: exception code c0000005, exception address 5C229DD3
Stack:


And:
ScreenHunter_21 Dec. 05 08.31.gif
Faulting module name: PGNP.dll, version: 1.4.0.3196, time stamp: 0x52788c8b
ScreenHunter_21 Dec. 05 08.31.gif (9.4 KiB) Viewed 1816 times


Code: Select all
Faulting application name: DtExecUI.exe, version: 11.0.3368.0, time stamp: 0x519d5ea7
Faulting module name: PGNP.dll, version: 1.4.0.3196, time stamp: 0x52788c8b
Exception code: 0xc0000005
Fault offset: 0x000c9dd3
Faulting process id: 0x1364
Faulting application start time: 0x01cef188fa510d8e
Faulting application path: C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn\ManagementStudio\DtExecUI.exe
Faulting module path: C:\Program Files (x86)\Intellisoft\RSNP\PGNP.dll
Report Id: a74823df-5d7c-11e3-9439-22000a24224f
Faulting package full name:
Faulting package-relative application ID:


Does anybody know how to solve this?

Kind regards,

Daniel
danielbos
 
Posts: 8
Joined: Fri Oct 18, 2013 2:19 pm

Re: Redshift connector forces SSIS to stop

Postby Moderator » Thu Dec 05, 2013 6:31 am

Hi Daniel,
could you please send us the table DDL and the test package? We will try to reproduce and fix the issue.

Also, have you used FastLoad in the destination?

Thank you
Moderator
User avatar
Moderator
Site Admin
 
Posts: 314
Joined: Wed Oct 29, 2008 11:27 pm

Re: Redshift connector forces SSIS to stop

Postby danielbos » Fri Dec 06, 2013 12:48 am

Hi PGNP,

I have two simple tables, like:
Code: Select all
CREATE TABLE public.a (
  b  varchar(256)
) ;

CREATE TABLE public.b_test (
  b  varchar(256)
) ;


I didn't use the fastload option.
ScreenHunter_22 Dec. 06 08.22.gif
Not the fastload option
ScreenHunter_22 Dec. 06 08.22.gif (2.37 KiB) Viewed 1800 times


When I do, the performance drops significantly (without setting the number of rows).
ScreenHunter_24 Dec. 06 08.34.gif
Fastload without batchsize
ScreenHunter_24 Dec. 06 08.34.gif (3.67 KiB) Viewed 1800 times


When I set also the number of rows to insert in a batch, like below, I get an Out of memory error.
ScreenHunter_23 Dec. 06 08.33.gif
Fastload with batchsize
ScreenHunter_23 Dec. 06 08.33.gif (4.11 KiB) Viewed 1800 times

Code: Select all
TITLE: Package Validation Error
------------------------------
Package Validation Error
------------------------------
ADDITIONAL INFORMATION:

Error at Data Flow Task [OLE DB Source [26]]: SSIS Error Code DTS_E_OLEDBERROR.  An OLE DB error has occurred. Error code: 0x80040E14.
An OLE DB record is available.  Source: "Redshift OLE DB Provider"  Hresult: 0x80040E14  Description: "out of memory for query result
".

Error at Data Flow Task [OLE DB Source [26]]: Opening a rowset for ""public"."b_test"" failed. Check that the object exists in the database.
Error at Data Flow Task [SSIS.Pipeline]: "OLE DB Source" failed validation and returned validation status "VS_ISBROKEN".
Error at Data Flow Task [SSIS.Pipeline]: One or more component failed validation.
Error at Data Flow Task: There were errors during task validation.

(Microsoft.DataTransformationServices.VsIntegration)

------------------------------
BUTTONS:
OK
------------------------------


I have send you the DDL and package. Thank you for looking into it!

Daniel
danielbos
 
Posts: 8
Joined: Fri Oct 18, 2013 2:19 pm

Re: Redshift connector forces SSIS to stop

Postby Moderator » Sun Jan 05, 2014 9:50 pm

We are investigating the issue. As first step, we enabled using Redshift cursors in build 3206.

As recommendation, could you set "Rows per batch" to a smaller number, say 1000, and see if that resolves the out-of-memory issue.
User avatar
Moderator
Site Admin
 
Posts: 314
Joined: Wed Oct 29, 2008 11:27 pm

Re: Redshift connector forces SSIS to stop

Postby moodu » Tue Jan 07, 2014 12:53 pm

I had a similar issue. The pagefile for virtual memory was set to manual and it had a hard cap. I had more luck when i set it back to system managed. Worth a check.
moodu
 
Posts: 1
Joined: Tue Jan 07, 2014 12:52 pm

Re: Redshift connector forces SSIS to stop

Postby Moderator » Mon Feb 10, 2014 9:12 am

I have performed various profiling tests, and confirmed that there are neither memory leaks, nor memory exceptions generated by the provider.

Then I found the following article: http://social.msdn.microsoft.com/Forums ... onservices. The article describes same issue.

Then I increased DefaultBufferSize to 20485760 (in the Control Flow Properties). And it solved the issue! The test package was able to continue processing beyond 10 million rows. I think this issue was caused by the way the SSIS manages internal buffers.

Also, the package can be changed to use queries with LIMIT/OFFSET clauses to be able to process any number of rows (many billions).
User avatar
Moderator
Site Admin
 
Posts: 314
Joined: Wed Oct 29, 2008 11:27 pm


Return to Redshift OLE DB Provider

Who is online

Users browsing this forum: No registered users and 1 guest

cron