代码之家 › 专栏 › 技术社区 › David Ranieri

合并排序后更新列表->尾部

mergesort linked-list c

David Ranieri · 技术社区 · 4 年前

在双链表的实现中,我使用了典型的结构:

struct node
{
    void *data;
    struct node *prev;
    struct node *next;
};

我还将在O(1)时间内插入列表末尾,所以我还有另一个 struct 存储 head 以及 tail :

struct linklist
{
    struct node *head;
    struct node *tail;
    size_t size;
};

该程序对所有插入和删除操作都能按预期工作,但我对排序函数有问题,我使用的是合并排序算法,据我所知,这是对列表进行排序最有效或最有效的算法之一,该算法工作良好:

static struct node *split(struct node *head)
{
    struct node *fast = head;
    struct node *slow = head;

    while ((fast->next != NULL) && (fast->next->next != NULL))
    {
        fast = fast->next->next;
        slow = slow->next;
    }

    struct node *temp = slow->next;

    slow->next = NULL;
    return temp;
}

static struct node *merge(struct node *first, struct node *second, int (*comp)(const void *, const void *))
{
    if (first == NULL)
    {
        return second;
    }
    if (second == NULL)
    {
        return first;
    }
    if (comp(first->data, second->data) < 0)
    {
        first->next = merge(first->next, second, comp);
        first->next->prev = first;
        first->prev = NULL;
        return first;
    }
    else
    {
        second->next = merge(first, second->next, comp);
        second->next->prev = second;
        second->prev = NULL;
        return second;
    }
}

static struct node *merge_sort(struct node *head, int (*comp)(const void *, const void *))
{
    if ((head == NULL) || (head->next == NULL))
    {
        return head;
    }

    struct node *second = split(head);

    head = merge_sort(head, comp);
    second = merge_sort(second, comp);
    return merge(head, second, comp);
}

但我不知道如何保存 list->tail 已更新:

void linklist_sort(struct linklist *list, int (*comp)(const void *, const void *))
{
    list->head = merge_sort(list->head, comp);
    // list->tail is no longer valid at this point
}

当然,我可以在订购和更新后浏览整个列表 列表->尾 用蛮力,但我想知道是否有更花哨的方法。

我设法用循环列表解决了这个问题,但我想避免改变程序的结构。

0 回复 | 直到 4 年前

chqrlie 4 年前

你的算法通过递归使用O(N)堆栈空间 merge 每一步的功能。使用这种方法,跟踪 tail 节点。您只需扫描列表即可找到并更新 list 结构在 linklist_sort 这一额外步骤不会改变排序操作的复杂性。从当前值开始可以节省一些时间 link->tail :如果列表已经排序,循环将立即停止。

以下是修改后的版本:

void linklist_sort(struct linklist *list, int (*comp)(const void *, const void *)) {
    list->head = merge_sort(list->head, comp);
    if (list->tail) {
        struct node *tail = list->tail;
        while (tail->next)
            tail = tail->next;
        list->tail = tail;
    }
}

使用合并排序对链表进行排序应仅使用 O(对数(N)) 空间和 O(N log(N)) 时间。

以下是一些改进此算法的想法:

由于您知道列表的长度,因此不需要扫描整个列表进行拆分。您可以将长度与列表指针一起传递,并使用它来确定分割的位置,只扫描列表的一半。
如果你转换 合并 对于非递归版本,您可以跟踪合并阶段的最后一个节点并更新指针 struct node **tailp 作为参数传递,以指向最后一个节点。这将节省最后一次扫描,删除递归将降低空间复杂度。基准测试将告诉我们,这是否会提高效率尚不明显。
根据经验,使用指向列表节点的辅助数组N指针,可以更有效地对链表进行排序,无论是单独排序还是双链接排序。您将对这个数组进行排序,并根据排序后的数组的顺序重新链接节点。额外的要求是 O(N) 尺寸。

这是一个使用列表长度和非递归的修改版本 合并 :

struct node {
    void *data;
    struct node *prev;
    struct node *next;
};

struct linklist {
    struct node *head;
    struct node *tail;
    size_t size;
};

static struct node *split(struct node *head, size_t pos) {
    struct node *slow = head;

    while (pos-- > 1) {
        slow = slow->next;
    }
    struct node *temp = slow->next;
    slow->next = NULL;
    return temp;
}

static struct node *merge(struct node *first, struct node *second,
                          int (*comp)(const void *, const void *))
{
    struct node *head = NULL;
    struct node *prev = NULL;
    struct node **linkp = &head;

    for (;;) {
        if (first == NULL) {
            second->prev = prev;
            *linkp = second;
            break;
        }
        if (second == NULL) {
            first->prev = prev;
            *linkp = first;
            break;
        }
        if (comp(first->data, second->data)) <= 0 {
            first->prev = prev;
            prev = *linkp = first;
            linkp = &first->next;
        } else {
            second->prev = prev;
            prev = *linkp = second;
            linkp = &second->next;
        }
    }
    return head;
}

static struct node *merge_sort(struct node *head, size_t size,
                               int (*comp)(const void *, const void *))
{
    if (size < 2) {
        return head;
    }

    struct node *second = split(head, size / 2);

    head = merge_sort(head, size / 2, comp);
    second = merge_sort(second, size - size / 2, comp);
    return merge(head, second, comp);
}

void linklist_sort(struct linklist *list, int (*comp)(const void *, const void *)) {
    list->head = merge_sort(list->head, comp, list->size);
    if (list->tail) {
        struct node *tail = list->tail;
        while (tail->next)
            tail = tail->next;
        list->tail = tail;
    }
}

请注意,您还可以简化 合并 函数,在排序过程中不更新返回指针,因为您可以在上次扫描期间重新链接整个列表。最后一次扫描会更长,对缓存不太友好,但它仍然应该更高效,更不容易出错。

rcgldr 4 年前

一种选择是将节点合并排序,就像它们是单个列表节点一样,然后在完成时进行一次传递以设置前面的指针,并更新尾部指针。

另一种选择是使用类似于C++std::list和std::list::sort的东西。使用循环双链表。有一个虚拟节点将“next”用作“head”,将“prev”用作“tail”。合并排序和合并的参数是迭代器或指针,仅用于跟踪运行边界,因为节点是通过在原始列表中移动来合并的。merge函数使用std::list::splice将第二次运行中的节点合并到第一次运行中。逻辑是,如果第一个run元素小于或等于第二个run元素,只需将迭代器或指针前进到第一个run,否则从第二个runs中删除节点,并将其插入到第一个runs的当前节点之前。如果涉及删除+插入步骤,这将自动更新虚拟节点中的头指针和尾指针。

将结构节点更改为:

struct node
{
    struct node *next;           // used as head for dummy node
    struct node *prev;           // used as tail for dummy node
    void *data;
};

会更通用一些。

由于在创建列表时分配了虚拟节点,因此begin==dummy->;接下来,最后==虚拟->prev和end==虚拟。

Roberto Caboni Prince Rabadiya 4 年前

我不是提供有关算法的深入分析的最佳人选 海上舞台 符号。无论如何,用一个已经被接受的“经典”答案来回答一个问题是很好的,因为有可能在没有太大压力的情况下探索替代解决方案。
这很有趣,即使如你所见, 分析的解决方案并不比问题中提出的当前解决方案好 .

该策略首先想知道是否有可能在不颠倒代码的情况下跟踪候选尾部元素。主要候选函数是决定排序列表中节点顺序的函数: merge() 功能。

现在,由于在比较之后我们决定哪个节点将在排序列表中排名第一,我们将得到一个 “失败者” 那会更靠近尾巴。因此,通过与每个步骤的当前尾部元素进行进一步比较,最终我们将能够更新 tail 元素与 “失败者中的失败者” .

合并功能将具有额外的 struct node **tail 参数(双指针是必需的,因为我们将更改列表 尾 领域到位 :

static struct node *merge(struct node *first, struct node *second, struct node **tail, int (*comp)(const void *, const void *))
{
    if (first == NULL)
    {
        return second;
    }
    if (second == NULL)
    {
        return first;
    }
    if (comp(first->data, second->data) < 0)
    {
        first->next = merge(first->next, second, tail, comp);

        /* The 'second' node is the "loser". Let's compare current 'tail' 
           with it, and in case it loses again, let's update  'tail'.      */
        if( comp(second->data, (*tail)->data) > 0)
            *tail = second;
        /******************************************************************/

        first->next->prev = first;
        first->prev = NULL;
        return first;
    }
    else
    {
        second->next = merge(first, second->next, tail, comp);

        /* The 'first' node is the "loser". Let's compare current 'tail' 
           with it, and in case it loses again, let's update  'tail'.      */
        if( comp(first->data, (*tail)->data) > 0)
            *tail = first;
        /******************************************************************/

        second->next->prev = second;
        second->prev = NULL;
        return second;
    }
}

除了“传播”之外,不需要对代码进行更多更改 尾 双指针参数通过 merge_sort() 和 linklist_sort() 功能:

static struct node *merge_sort(struct node *head, struct node **tail, int (*comp)(const void *, const void *));

void linklist_sort(List_t *list, int (*comp)(const void *, const void *))
{
    list->head = merge_sort(list->head, &(list->tail), comp);
}

测试

为了测试这个修改,我必须编写一个基本的 insert() 功能,a compare() 函数旨在按降序获得排序列表,以及 printList() 公用事业。然后我写了一个主程序来测试所有的东西。

我做了几次测试;这里我只举一个例子,在这个例子中,我省略了问题和上述答案中的函数:

#include <stdio.h>

typedef struct node
{
    void *data;
    struct node *prev;
    struct node *next;
} Node_t;

typedef struct linklist
{
    struct node *head;
    struct node *tail;
    size_t size;
} List_t;

void insert(List_t *list, int data)
{
    Node_t * newnode = (Node_t *) malloc(sizeof(Node_t) );
    int * newdata = (int *) malloc(sizeof(int));
    *newdata = data;

    newnode->data = newdata;
    newnode->prev = list->tail;
    newnode->next = NULL;
    if(list->tail)
        list->tail->next = newnode;

    list->tail = newnode;

    if( list->size++ == 0 )
        list->head = newnode;   
}

int compare(const void *left, const void *right)
{
    if(!left && !right)
        return 0;

    if(!left && right)
        return 1;
    if(left && !right)
        return -1;

    int lInt = (int)*((int *)left), rInt = (int)*((int *)right);

    return (rInt-lInt); 
}

void printList( List_t *l)
{
    for(Node_t *n = l->head; n != NULL; n = n->next )
    {
        printf( " %d ->", *((int*)n->data));
    }
    printf( " NULL (tail=%d)\n", *((int*)l->tail->data));
}


int main(void)
{
  List_t l = { 0 };

  insert( &l, 5 );
  insert( &l, 3 );
  insert( &l, 15 );
  insert( &l, 11 );
  insert( &l, 2 );
  insert( &l, 66 );
  insert( &l, 77 );
  insert( &l, 4 );
  insert( &l, 13 );
  insert( &l, 9 );
  insert( &l, 23 );

  printList( &l );

  linklist_sort( &l, compare );

  printList( &l );

  /* Free-list utilities omitted */

  return 0;
}

在这个特定的测试中,我得到了以下输出:

 5 -> 3 -> 15 -> 11 -> 2 -> 66 -> 77 -> 4 -> 13 -> 9 -> 23 -> NULL (tail=23)
 77 -> 66 -> 23 -> 15 -> 13 -> 11 -> 9 -> 5 -> 4 -> 3 -> 2 -> NULL (tail=2)

结论

好消息是,从理论上讲,我们仍然有一个算法,在最坏的情况下 O(N log(N)) 时间复杂性。
坏消息是,为了避免在链表中进行线性搜索(N个“简单步骤”),我们必须 N*logN 比较,涉及对函数的调用。 这使得线性搜索仍然是一个更好的选择 .