Main   Products   Offshore Outsourcing   Customers   Partners   ContactUs  
JDBC Databases
  HXTT Access v5.2
  HXTT Cobol v2.1
  HXTT DBF v5.2
  HXTT Excel v4.2
  HXTT Paradox v5.2
  HXTT Text(CSV) v5.2
 
  Buy Now
  Support
  Download
  Document
  FAQ
  HXTT XML v1.2
Offshore Outsourcing
Oracle Data Import/Export
DB2 Data Import/Export
Sybase Data Import/Export
Free Resources
  Firewall Tunneling
  Search Indexing Robot
  Conditional Compilation
  Password Recovery for MS Access
  Password Recovery for Corel Paradox
  Checksum Tool for MD5
  Character Set Converter
  Pyramid - Poker of ZYH
   
   
   
Hongxin Technology & Trade Ltd. of Xiangtan City (abbr, HXTT)

HXTT Text(CSV)
UNION (ALL)
Frank
2006-09-13 15:49:25.0
Hi,

I have a speed problem with a select statement that contains UNION (ALL).
Let me explain: I have two selects that I want to combine on a 400.000 row table with 50 text fields.

the first select takes 2 minutes and takes the form:
select * from TABLE_X where COLUMN_Y = 'BlaBala'
The result is approximately 40.000 rows

the second select is similar to the first but the where clause is different and also takes 2 minutes:
select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'

The result is approximately 40.000 rows

When I combine the two selects to one:
select * from TABLE_X where COLUMN_Y = 'BlaBala'
UNION
select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'

This never returns (I stopped waiting after half an hour). Probably this is because with UNION only distinct rows are returned and that takes a long time?

Changing the statement to:
select * from TABLE_X where COLUMN_Y = 'BlaBala'
UNION ALL
select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'

This makes the situation a bit better but it still takes at least 10 minutes instead of the 4 minutes that would have been expected since this statement is equivalent to executing the two select statements one after the other. Any idea what could be going wrong here?

Thanks,
Frank
Re:UNION (ALL)
HXTT Support
2006-09-13 23:34:23.0
What's your table type (text or CSV)? I simulated your situation, but failed to recur your issue in a ver slow speed. I guess that you're using CSV table.

>This never returns (I stopped waiting after half an hour). Probably this is
> because with UNION only distinct rows are returned and that takes a long
> time?
Yeah. The latest package has optimized UNION speed. You can download and try it.

>This makes the situation a bit better but it still takes at least 10 minutes
> instead of the 4 minutes that would have been expected since this statement
> is equivalent to executing the two select statements one after the other.
I han't seen such a situation in my test. Your 400.000 row table with 50 text fields should have many long string values so that Java VM wasted many minutes in memory swap. You can try to replace "select *" with "select columnList" to avoid to load unused values.

I'm enclosing two test case below:



import java.sql.*;
import java.util.Properties;

/* Used to test UNION speed */
/* on PIII733 400,000 rows (150MB)
Old test speed:
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala'
Spent time:18857ms
Row count:40000
Sql:select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:18196ms
Row count:40000
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:43603ms
Row count:182
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION ALL select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:32807ms
Row count:80000

New test speed:
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala'
Spent time:17796ms
Row count:40000
Sql:select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:19087ms
Row count:40000
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:39116ms
Row count:182
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION ALL select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:33108ms
Row count:80000


*/


public class testText19{
public static void main(String argv[]){
try{
Class.forName("com.hxtt.sql.text.TextDriver").newInstance();


String url="jdbc:text:////textfiles";

Properties properties=new Properties();

properties.setProperty("fileExtension", "txt");

Connection con = DriverManager.getConnection(url,properties);

Statement stmt = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,ResultSet.CONCUR_READ_ONLY);
String sql="create table TABLE_X (status varchar (2), notsure varchar (7), lastName varchar (20), firstName varchar (11), gender varchar (1), dob varchar (8), HealthCardNum varchar (12), healthCardversionNum varchar (2), number varchar (8), comments varchar (20), streetAdressLine1 varchar (30), "
+"city varchar (15), province varchar (3), postalCode varchar (6), homePhone varchar (10), workPhone varchar (10), secondAddress varchar (81), dunno varchar (49), dunno1 varchar (4), due varchar (8), due2 varchar (8), patKey2 varchar (3), morepadding varchar (16), flag varchar (1), eor varchar (35),"
+"COLUMN_Y varchar (14) ); ";
stmt.executeUpdate(sql);

if(false){//init data
sql="insert into TABLE_X (COLUMN_Y,status) values(?,?);";
PreparedStatement pstmt = con.prepareStatement(sql);
for(int i=0;i< 400000;i++){
if(i%10==1){
pstmt.setString(1,"BlaBala");
}else if(i%10==5){
pstmt.setString(1,"qdqsdqdq");
}else{
pstmt.setInt(1, i % 100);
}
if(i%130==0)
pstmt.setString(2,null);
else
pstmt.setInt(2,i%10000);
pstmt.executeUpdate();
}
pstmt.close();

}

String[] sqls=new String[]{
"select * from TABLE_X where COLUMN_Y = 'BlaBala' ",
"select * from TABLE_X where COLUMN_Y = 'qdqsdqdq' ",
"select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION select * from TABLE_X where COLUMN_Y = 'qdqsdqdq' ",
"select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION ALL select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'"
};
for(int i=0;i< sqls.length;i++){
long currentTime=System.currentTimeMillis();
ResultSet rs = stmt.executeQuery(sqls[i]);

rs.beforeFirst();
int rowCount=0;
while (rs.next()) {
rowCount++;
}
currentTime=System.currentTimeMillis()-currentTime;

rs.close();

System.out.println("Sql:"+ sqls[i] );
System.out.println("Spent time:" +currentTime+"ms" );
System.out.println("Row count:" +rowCount);

}

stmt.close();

con.close();


}
catch( SQLException sqle )
{
do
{
System.out.println(sqle.getMessage());
System.out.println("Error Code:"+sqle.getErrorCode());
System.out.println("SQL State:"+sqle.getSQLState());
sqle.printStackTrace();
}while((sqle=sqle.getNextException())!=null);
}
catch( Exception e )
{
System.out.println(e.getMessage());
e.printStackTrace();
}
}
}




import java.sql.*;
import java.util.Properties;

/* Used to test UNION speed */
/* on PIII733 400,000 rows (12.50MB)
Old test speed:
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala'
Spent time:10936ms
Row count:40000
Sql:select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:9473ms
Row count:40000
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:31316ms
Row count:182
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION ALL select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:22101ms
Row count:80000

New test speed:
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala'
Spent time:29883ms
Row count:40000
Sql:select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:10445ms
Row count:40000
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:22363ms
Row count:182
Sql:select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION ALL select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'
Spent time:21300ms
Row count:80000

*/


public class testText20{
public static void main(String argv[]){
try{
Class.forName("com.hxtt.sql.text.TextDriver").newInstance();


String url="jdbc:csv:////textfiles";

Properties properties=new Properties();

Connection con = DriverManager.getConnection(url,properties);

Statement stmt = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,ResultSet.CONCUR_READ_ONLY);
String sql="create table TABLE_X (status varchar (2), notsure varchar (7), lastName varchar (20), firstName varchar (11), gender varchar (1), dob varchar (8), HealthCardNum varchar (12), healthCardversionNum varchar (2), number varchar (8), comments varchar (20), streetAdressLine1 varchar (30), "
+"city varchar (15), province varchar (3), postalCode varchar (6), homePhone varchar (10), workPhone varchar (10), secondAddress varchar (81), dunno varchar (49), dunno1 varchar (4), due varchar (8), due2 varchar (8), patKey2 varchar (3), morepadding varchar (16), flag varchar (1), eor varchar (35),"
+"COLUMN_Y varchar (14) ); ";
stmt.executeUpdate(sql);



if(false){//init data
sql="insert into TABLE_X (COLUMN_Y,status) values(?,?);";
PreparedStatement pstmt = con.prepareStatement(sql);
for(int i=0;i< 400000;i++){
if(i%10==1){
pstmt.setString(1,"BlaBala");
}else if(i%10==5){
pstmt.setString(1,"qdqsdqdq");
}else{
pstmt.setInt(1, i % 100);
}
if(i%130==0)
pstmt.setString(2,null);
else
pstmt.setInt(2,i%10000);
pstmt.executeUpdate();
}
pstmt.close();

}

String[] sqls=new String[]{
"select * from TABLE_X where COLUMN_Y = 'BlaBala' ",
"select * from TABLE_X where COLUMN_Y = 'qdqsdqdq' ",
"select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION select * from TABLE_X where COLUMN_Y = 'qdqsdqdq' ",
"select * from TABLE_X where COLUMN_Y = 'BlaBala' UNION ALL select * from TABLE_X where COLUMN_Y = 'qdqsdqdq'"
};
for(int i=0;i< sqls.length;i++){
long currentTime=System.currentTimeMillis();
ResultSet rs = stmt.executeQuery(sqls[i]);

rs.beforeFirst();
int rowCount=0;
while (rs.next()) {
rowCount++;
}
currentTime=System.currentTimeMillis()-currentTime;

rs.close();

System.out.println("Sql:"+ sqls[i] );
System.out.println("Spent time:" +currentTime+"ms" );
System.out.println("Row count:" +rowCount);

}

stmt.close();

con.close();


}
catch( SQLException sqle )
{
do
{
System.out.println(sqle.getMessage());
System.out.println("Error Code:"+sqle.getErrorCode());
System.out.println("SQL State:"+sqle.getSQLState());
sqle.printStackTrace();
}while((sqle=sqle.getNextException())!=null);
}
catch( Exception e )
{
System.out.println(e.getMessage());
e.printStackTrace();
}
}
}




Re:Re:UNION (ALL)
frank
2006-09-14 00:56:50.0
Thanks for the update, I'll give the new version a try.
You are right I'm using CSV.

Frank

Search Key   Search by Last 50 Questions




Google
 

Address: 9 Station Rd., Xiangtan City, Hunan Province, P.R. China
Postcode: 411100
Phone: (86)731-58225727
Fax: (86)731-58225727
Email: webmaster@hxtt.com
Copyright © 1999-2011 Hongxin Technology & Trade Ltd. | All Rights Reserved. | Privacy | Legal | Sitemap