Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
429 views
in Technique[技术] by (71.8m points)

sql server - SQL order by in two steps

I have a big table with two datetime columns.

[Timestamp] and [TimestampRounded]

The [Timestamp] column has the full timestamp including milliseconds and the table has no index for this column.

The [TimestampRounded] column has the timestamp but milliseconds, seconds, and minutes truncated (set to 0). The table has a clustered index for this column. That is, the table is effectively stored in the order of this column. Typically the newest row is on the top of the table. The index was created like this:

CREATE CLUSTERED INDEX cidx_time ON [dbo].[MyTable] ([TimestampRounded] DESC)

Now, I want to retrieve some data leveraging my clustered index so I do the following select, my table has around 5 million rows.

Query 1:

SELECT TOP(100) * FROM [dbo].[MyTable] ORDER BY [TimestampRounded] DESC

This query returns immediately (less than 1 second). But the 100 returned rows are not ordered with respect to milliseconds, only by hour.

Then I learned if I also want to order by a second column I do:

Query 2:

SELECT TOP(100) * FROM [dbo].[MyTable] ORDER BY [TimestampRounded] DESC, [Timestamp] DESC

This query is very slow and takes around 23 seconds to return the 100 rows.

My immediate solution was to use the first query and then just order those returned 100 rows in my client frontend code. But I experienced some problem that I missed rows that should have returned so I would like to understand how I can fix/rewrite query 2 to return those 100 sorted rows as expected, and by reasonable logic should also take less than 1 second. Since the table is already stored by hour (clustered index) I do not understand why it should take longer.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

I might be oversimplifying, but why not simply create an index on the column that stores the entire timestamp?

CREATE INDEX cidx_time2 ON [dbo].[MyTable] ([Timestamp] DESC)

Then, you can just do:

SELECT TOP(100) * FROM [dbo].[MyTable] ORDER BY [[Timestamp] DESC

Or, if you need to two timestamps in the order by clause for some reason, then you want an index on both columns:

CREATE INDEX cidx_time3 ON [dbo].[MyTable] ([TimestampRounded] DESC, [Timestamp] DESC);

Then you can run your original query:

SELECT TOP(100) * FROM [dbo].[MyTable] ORDER BY [TimestampRounded] DESC, [Timestamp] DESC

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
...