
How do I construct a vec-of-vecs value for a record batch in DataFusion?

  • Finlay Weber  · Tech Community  · 1 year ago

    I can create a column of type Utf8 as follows:

        let schema = Arc::new(Schema::new(vec![
            Field::new("id", DataType::Int32, false),
            Field::new("payload", DataType::Utf8, false),
        ]));
    
        let vec_of_strings: Vec<String> = vec!["one".to_string(), "two".to_string()];
        
        let batch = RecordBatch::try_new(
            schema,
            vec![
                Arc::new(Int32Array::from_slice([1, 2])),
                Arc::new(StringArray::from(vec_of_strings)),
            ],
        )?;
    
        ctx.register_batch("demo", batch)?;
    

    Running a query against it like this:

        let df = ctx.sql(r#"
           SELECT *
           from demo
        "#).await?;
    

    gives the expected result:

    +----+---------+
    | id | payload |
    +----+---------+
    | 1  | one     |
    | 2  | two     |
    +----+---------+
    

    Now I have a use case where the payload should be an array, something like this:

    +----+------------------------+
    | id | payload                |
    +----+------------------------+
    | 1  | [piano, guitar, drums] |
    | 2  | [violin, piano]        |
    +----+------------------------+
    

    How do I do this?

    Changing vec_of_strings to vec_of_vecs fails. That is, given

        let vec_of_vecs: Vec<Vec<String>> = vec![
            vec!["piano".to_string(), "guitar".to_string(), "drums".to_string()],
            vec!["violin".to_string(), "piano".to_string()]
        ];
    

    and using it to create the batch like this:

        let batch = RecordBatch::try_new(
            schema,
            vec![
                Arc::new(Int32Array::from_slice([1, 2])),
                Arc::new(StringArray::from(vec_of_vecs)),
            ],
        )?;
    

    compilation fails with the error:

       |
    80 |             Arc::new(StringArray::from(vec_of_vecs)),
      |                      ----------------- ^^^^^^^^^^^ the trait `From<Vec<Vec<std::string::String>>>` is not implemented for `GenericByteArray<GenericStringType<i32>>`
      |                      |
      |                      required by a bound introduced by this call
      |
      = help: the following other types implement trait `From<T>`:
                <GenericByteArray<GenericBinaryType<OffsetSize>> as From<GenericByteArray<GenericStringType<OffsetSize>>>>
                <GenericByteArray<GenericBinaryType<OffsetSize>> as From<Vec<&[u8]>>>
                <GenericByteArray<GenericBinaryType<OffsetSize>> as From<Vec<Option<&[u8]>>>>
                <GenericByteArray<GenericBinaryType<T>> as From<GenericListArray<T>>>
                <GenericByteArray<GenericStringType<OffsetSize>> as From<GenericByteArray<GenericBinaryType<OffsetSize>>>>
                <GenericByteArray<GenericStringType<OffsetSize>> as From<GenericListArray<OffsetSize>>>
                <GenericByteArray<GenericStringType<OffsetSize>> as From<Vec<&str>>>
                <GenericByteArray<GenericStringType<OffsetSize>> as From<Vec<Option<&str>>>>
              and 3 others
    

    Any idea how I can achieve the above?
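
    For context on why the StringArray conversion is rejected: a StringArray holds exactly one string per row, so a Vec<Vec<String>> needs a list-typed column instead. In the Arrow columnar format a list column is stored as one flat child array of values plus an offsets buffer with one more entry than there are rows. The sketch below models that layout with the standard library only. StringListColumn and its methods are hypothetical names made up for illustration and are not part of arrow-rs or DataFusion. In arrow-rs the corresponding pieces are, as far as I know, ListArray (commonly built with ListBuilder<StringBuilder>) and a schema field declared with DataType::List; the exact signatures vary between arrow versions, so check the docs for the version in use.

```rust
/// Hypothetical stand-in for a list-of-strings column (not an arrow-rs type).
/// It mirrors the values + offsets layout that Arrow uses for list columns.
#[derive(Debug, PartialEq)]
pub struct StringListColumn {
    /// All strings from all rows, stored back to back.
    pub values: Vec<String>,
    /// Row i covers values[offsets[i]..offsets[i + 1]]; length is rows + 1.
    pub offsets: Vec<i32>,
}

impl StringListColumn {
    /// Flatten nested vectors into one values vector plus row offsets.
    pub fn from_rows(rows: Vec<Vec<String>>) -> Self {
        let mut values = Vec::new();
        let mut offsets = vec![0i32]; // the first row always starts at 0
        for row in rows {
            values.extend(row);
            // Record where this row ends, which is where the next one starts.
            offsets.push(values.len() as i32);
        }
        Self { values, offsets }
    }

    /// Number of rows in the column.
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Borrow the strings that belong to row `i`.
    pub fn row(&self, i: usize) -> &[String] {
        let start = self.offsets[i] as usize;
        let end = self.offsets[i + 1] as usize;
        &self.values[start..end]
    }
}
```

    With the two rows from the desired table, from_rows would produce five flat values and the offsets 0, 3, 5, and row(1) would borrow the last two values. A real list builder does the same bookkeeping: append each inner string to the child builder, then close the row.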
