[Solved] T-SQL to generate expect result


Assuming that I understand the problem:

  • you have a number of rows with sections (“Cates”) and symbols (“Type”);
  • if there are any symbols ending in a minus sign then these indicate a row without a minus sign should be removed;
  • symbols are never “mixed” per section, i.e. a section can never have “A” and “B-“;
  • there will always be a row to remove if there is a type with a minus;
  • rows should be removed starting with the lowest Id.

Then this should work:

DECLARE @data TABLE (
    Id INT,
    Cate VARCHAR(5),
    [Type] VARCHAR(2));
INSERT INTO @data SELECT 1, 'AM001', 'A';
INSERT INTO @data SELECT 2, 'AM001', 'A-';
INSERT INTO @data SELECT 3, 'AM001', 'A';
INSERT INTO @data SELECT 4, 'AM003', 'B';
INSERT INTO @data SELECT 6, 'AM003', 'B';
INSERT INTO @data SELECT 7, 'AM005', 'B';
INSERT INTO @data SELECT 8, 'AM005', 'B-';
INSERT INTO @data SELECT 9, 'AM005', 'B';
INSERT INTO @data SELECT 10, 'AM006', 'A';
INSERT INTO @data SELECT 11, 'AM006', 'A';
INSERT INTO @data SELECT 12, 'AM011', 'B';
INSERT INTO @data SELECT 13, 'AM011', 'B-';
INSERT INTO @data SELECT 14, 'AM011', 'B';
WITH NumberToRemove AS (
    SELECT
        Cate,
        COUNT(*) AS TakeOff
    FROM
        @data
    WHERE
        [Type] LIKE '_-'
    GROUP BY
        Cate),
Ordered AS (
    SELECT
        Id,
        Cate,
        [Type],
        ROW_NUMBER() OVER (PARTITION BY Cate ORDER BY Id) AS RowId
    FROM
        @data
    WHERE
        [Type] NOT LIKE '_-')
SELECT
    d.*
FROM
    @data d
    LEFT JOIN NumberToRemove m ON m.Cate = d.Cate
    INNER JOIN Ordered o ON o.Id = d.Id
WHERE
    o.RowId > ISNULL(m.TakeOff, 0);

The query works by first counting the number of rows to remove from each section (“Cate”) by tallying up the number of symbols with a minus sign per section. Next it sorts the rows where the symbols don’t have a minus sign and assigns each row a number in Id order (“row number”), starting back at 1 for each new section (“Cate”).

Finally I just pick the rows without a minus sign symbol, where the row number is greater than the number that were to be removed. Note that if a section has no rows to remove then it will return NULL rows to remove, so I transform this to 0, because ALL rows in that section with have a row number greater than 0.

My results were:

Id  Cate    Type
3   AM001   A
4   AM003   B
6   AM003   B
9   AM005   B
10  AM006   A
11  AM006   A
14  AM011   B

If my assumptions were incorrect then this script could easily be amended to suit…

0

solved T-SQL to generate expect result