
Parsing date strings in PHP

  •  21
  • nickf  · Tech Community  · 14 years ago

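    For example, given an arbitrary string ("I'm going to play croquet next Friday" or "Gadzooks, is it 17th June already?"), how would you go about extracting the date from it?

    If this looks like a good candidate for the too-hard basket, perhaps you could suggest an alternative. I want to be able to parse Twitter messages for dates. The tweets I'd be looking at are ones that users direct at this service, so they could be coached into using an easier format, but I'd like it to be as transparent as possible. Can you think of a good middle ground?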

    8 replies  |  as of 14 years ago
        1
  •  12
  •   Dolph    14 years ago

    If you have the horsepower, you could try the following algorithm. I'm showing an example and leaving the tedious work to you :)

    //Attempt to perform strtotime() on each contiguous subset of words...
    
    //1st iteration
    strtotime("Gadzooks, is it 17th June already")
    strtotime("is it 17th June already")
    strtotime("it 17th June already")
    strtotime("17th June already")
    strtotime("June already")
    strtotime("already")
    
    //2nd iteration
    strtotime("Gadzooks, is it 17th June")
    strtotime("is it 17th June")
    strtotime("it 17th June")
    strtotime("17th June") //date!
    strtotime("June") //date!
    
    //3rd iteration
    strtotime("Gadzooks, is it 17th")
    strtotime("is it 17th")
    strtotime("it 17th")
    strtotime("17th") //date!
    
    //4th iteration
    strtotime("Gadzooks, is it")
    //etc
    

    We can assume that strtotime("17th June") is a better match than strtotime("17th"), since it covers more of the words.

        2
  •  6
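  •   Scott Saunders    14 years ago

    First check whether the whole string is a valid date with strtotime(). If so, you're done.

    If not, let n be the number of words in the string. Loop over every run of n-1 consecutive words and use strtotime() to see whether the phrase is a valid date. If so, you've found the longest valid date string within the original string.

    If not, loop over every run of n-2 consecutive words and use strtotime() to see whether the phrase is a valid date. If so, you've found the longest valid date string within the original string. Keep shrinking the run by one word until something parses or you run out of words.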
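
    The steps above can be sketched in PHP roughly as follows. This is a minimal illustration, not the answerer's own code: the function name findLongestDatePhrase and the punctuation trimming are assumptions made for the example; only strtotime() comes from the answer itself.

```php
<?php
declare(strict_types=1);

/**
 * Returns the longest run of consecutive words that strtotime() accepts,
 * or null when no run parses. (Hypothetical helper, for illustration only.)
 */
function findLongestDatePhrase(string $text): ?string
{
    // Split on whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // Assumption: strip leading/trailing punctuation such as "?" or ",",
    // which would otherwise make strtotime() reject the phrase.
    $words = array_map(fn(string $w): string => trim($w, '?!,.;:'), $words);
    $words = array_values(array_filter($words, fn(string $w): bool => $w !== ''));

    $n = count($words);

    // Try the longest windows first: n words, then n-1, then n-2, ...
    for ($len = $n; $len >= 1; $len--) {
        // Slide a window of $len words across the sentence.
        for ($start = 0; $start + $len <= $n; $start++) {
            $phrase = implode(' ', array_slice($words, $start, $len));
            if (strtotime($phrase) !== false) {
                return $phrase; // first hit at this length is the longest match
            }
        }
    }

    return null;
}
```

    Trying the longest windows first is what makes "17th June" win over "17th". The number of strtotime() calls grows quadratically with the word count, which should be acceptable for tweet-length input. Note that strtotime() also accepts some ordinary words on their own, such as "now" or "may", so very short matches may need extra filtering.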

        3
  •  3
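  •   Bing    5 years ago

    Inspired by Juan Cortés's now-broken link, which was based on Dolph's algorithm, I went ahead and wrote it up myself. Note that I decided to return on the first successful match.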

    <?php
    function extractDatetime($string) {
        if(strtotime($string)) return $string;
        $string = str_replace(array(" at ", " on ", " the "), " ", $string);
        if(strtotime($string)) return $string;
    
        $list = explode(" ", $string);
        $first_length = count($list);
        for($j=0; $j < $first_length; $j++) {
            $original_length = count($list);
            for($i=0; $i < $original_length; $i++) {
                $temp_list = $list;
                for($k = 0; $k < $i; $k++) unset($temp_list[$k]);
                //echo "<code>".implode(" ", $temp_list)."</code><br/>"; // for visualizing the tests, if you want to see it
                if(strtotime(implode(" ", $temp_list))) return implode(" ", $temp_list);
            }
            array_pop($list);
        }
    
        return false;
    }
    

    Input

    $array = array(
            "Gadzooks, is it 17th June already",
            "I’m going to play croquet next Friday",
            "Where was the Rav Four yesterday at 6 PM?",
            "Where was Steve on Monday at 7am?"
    );
    
    foreach($array as $a) echo "$a => ".extractDatetime(str_replace("?", "", $a))."<hr/>";
    

    Output (with the debugging echo uncommented)

    Gadzooks, is it 17th June already
    is it 17th June already
    it 17th June already
    17th June already
    June already
    already
    Gadzooks, is it 17th June
    is it 17th June
    it 17th June
    17th June
    Gadzooks, is it 17th June already => 17th June
    -----
    I’m going to play croquet next Friday
    going to play croquet next Friday
    to play croquet next Friday
    play croquet next Friday
    croquet next Friday
    next Friday
    I’m going to play croquet next Friday => next Friday
    -----
    Where was Rav Four yesterday 6 PM
    was Rav Four yesterday 6 PM
    Rav Four yesterday 6 PM
    Four yesterday 6 PM
    yesterday 6 PM
    Where was the Rav Four yesterday at 6 PM? => yesterday 6 PM
    -----
    Where was Steve Monday 7am
    was Steve Monday 7am
    Steve Monday 7am
    Monday 7am
    Where was Steve on Monday at 7am? => Monday 7am
    -----
    
        4
  •  2
  •   Babiker    14 years ago

    You could do it by hand:

    $months = array(
                        "01" => "January", 
                        "02" => "February", 
                        "03" => "March", 
                        "04" => "April", 
                        "05" => "May", 
                        "06" => "June", 
                        "07" => "July", 
                        "08" => "August", 
                        "09" => "September", 
                        "10" => "October", 
                        "11" => "November", 
                        "12" => "December"
                    );
    
    $weekDays = array(
                        "01" => "Monday", 
                        "02" => "Tuesday", 
                        "03" => "Wednesday", 
                        "04" => "Thursday", 
                        "05" => "Friday", 
                        "06" => "Saturday", 
                        "07" => "Sunday"
                    );
    
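    // $string is the text to search, e.g. the tweet.
    foreach($months as $value){
        // stripos() is case-insensitive; compare with false because a match at offset 0 is falsy.
        if(stripos($string, $value) !== false){
            // extract and assign as you like...
        }
    }
    

    Maybe do another loop to check for the weekdays or other formats, or just nest them.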

        5
  •  2
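  •   Juan Cortés    14 years ago

    Use PHP's strtotime function.

    Of course, you would need to set up some rules for parsing, since you have to strip all the extra content from the string first. Apart from that, it is a very flexible function that will most likely help you here.

    For example, it can take strings like "next Friday" and "June 15th" and return the corresponding UNIX timestamp for the date in the string. I guess that if you apply some basic rules, such as looking for "next X" and for weekday and month names, you could make this work.

    If you can locate "next Friday" in "I'm going to play croquet next Friday", you can extract the date. Looks like a fun project! But keep in mind that strtotime only accepts English phrases and will not work with any other language.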

    $datestring = "I'm going to play croquet next Friday";
    
    $weekdays = array('monday','tuesday','wednesday',
                      'thursday','friday','saturday','sunday');
    
    foreach($weekdays as $weekday){
        if(strpos(strtolower($datestring),"next ".$weekday) !== false){
            echo date("F j, Y, g:i a",strtotime("next ".$weekday));
        }
    }
    

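    This returns the date of the next weekday mentioned in the string, as long as it follows the rule! In this particular case, the output was June 18, 2010, 12:00 am.

    As has already been pointed out, with regular expressions and a bit of patience you can do this. The hardest part of coding is deciding which approach to take to your problem, not writing the code once you know what you want.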
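
    As a rough sketch of the regular-expression route, the "next X" rule can be widened to a small set of qualifiers in a single pattern. The function name findRelativeWeekday, the qualifier list (next, last, this) and the keys of the returned array are assumptions made for this example; strtotime(), date() and preg_match() are standard PHP.

```php
<?php
declare(strict_types=1);

/**
 * Looks for "<next|last|this> <weekday>" anywhere in $text and resolves it
 * with strtotime(). Returns null when nothing matches.
 * (Hypothetical helper, for illustration only.)
 */
function findRelativeWeekday(string $text): ?array
{
    $days = 'monday|tuesday|wednesday|thursday|friday|saturday|sunday';

    // \b keeps us from matching inside longer words; /i ignores case.
    $pattern = '/\b(next|last|this)\s+(' . $days . ')\b/i';
    if (preg_match($pattern, $text, $m) !== 1) {
        return null;
    }

    $phrase = strtolower($m[1] . ' ' . $m[2]);
    $timestamp = strtotime($phrase);
    if ($timestamp === false) {
        return null;
    }

    return [
        'string' => $phrase,                      // the phrase that was recognised
        'unix'   => $timestamp,                   // UNIX timestamp from strtotime()
        'date'   => date('F j, Y', $timestamp),   // human-readable form
    ];
}

$result = findRelativeWeekday("I'm going to play croquet next Friday");
if ($result !== null) {
    echo $result['string'] . ' => ' . $result['date'] . PHP_EOL;
}
```

    The timestamp is relative to the moment the script runs, so the printed date changes from week to week. Like strtotime() itself, the pattern only understands English weekday names.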

        6
  •  2
  •   Juan Cortés    14 years ago

    Dolph Mathews […] F j, Y ). I wrote a short post about this, Extracting a date from a string with PHP. Here is the output for the two example strings:

    Input: I'm going to play croquet next Friday

    Output: Array ( 
               [string] => "next friday",
               [unix] => 1276844400,
               [date] => "June 18, 2010" 
            )
    

    Input: Gadzooks, is it 17th June already?

    Output: Array ( 
               [string] => "17th june",
               [unix] => 1276758000,
               [date] => "June 17, 2010" 
            )
    

    I hope it helps someone.

        7
  •  2
  •   jes    8 years ago

    Based on […]'s suggestion, I wrote a function that I think serves the purpose.

    public function parse_date($text, $offset, $length){
    
      $parseArray = preg_split( "/[\s,.]/", $text);
      $dateTest = implode(" ", array_slice($parseArray, $offset, $length == 0 ? null : $length));
    
      $date = strtotime($dateTest);
    
      if ($date){
        return $date;
      }
    
      //make the string one word shorter in the front
      $offset++;
    
      //have we reached the end of the array?
      if($offset > count($parseArray)){
    
        //reset the start of the string
        $offset = 0;
    
        //trim the end by one
        $length--;
    
        //reached the very bottom with no date found
        if(abs($length) >= count($parseArray)){
          return false;
        }
      }
    
      //try to find the date with the new substring
      return $this->parse_date($text, $offset, $length);
    }
    

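    You can call it like this (from inside the same class, since it is declared as a method and recurses through $this):

    $this->parse_date('Set the deadline now January 5 2017', 0, 0)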

        8
  •  1
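  •   Matt    14 years ago

    Start with the Wikipedia article. Keep in mind that a parser can become very complex, because this is really a language-recognition problem. That is the kind of problem usually tackled in the AI / computational linguistics field.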

        9
  •  1
  •   Mikulas Dite    14 years ago

    (\d{1,2})? 
    ((mon|tue|wed|thu|fri|sat|sun)|(monday|tuesday|wednesday|thursday|friday|saturday|sunday))?
    (\d{1,2})? (\d{2,4})?
    

    I skipped the months, since I'm not sure I remember them in the right order.
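
    The pattern above leaves out month names. One possible way to fill that gap is sketched below: it matches a day followed by a month, or a month followed by a day, with an optional four-digit year. The function name findDayAndMonth and the exact pattern are assumptions made for this example, not part of the answer; abbreviated month names are deliberately not handled.

```php
<?php
declare(strict_types=1);

/**
 * Returns the first "17th June" / "June 17th" / "June 5th, 2017" style
 * fragment found in $text, or null if there is none.
 * (Hypothetical helper, for illustration only.)
 */
function findDayAndMonth(string $text): ?string
{
    // The order of the alternatives does not matter for matching,
    // because no full month name is a prefix of another.
    $months = 'january|february|march|april|may|june|july|august'
            . '|september|october|november|december';

    $day  = '\d{1,2}(?:st|nd|rd|th)?';   // 5, 17th, 1st, ...
    $year = '(?:,?\s+\d{4}\b)?';         // optional ", 2017" or " 2017"

    $pattern = '/\b(?:' . $day . '\s+(?:' . $months . ')'
             . '|(?:' . $months . ')\s+' . $day . ')\b' . $year . '/i';

    if (preg_match($pattern, $text, $m) !== 1) {
        return null;
    }

    return $m[0]; // the matched fragment, ready to hand to strtotime()
}
```

    The returned fragment can then be passed to strtotime() to obtain a timestamp. Because "may" is also an ordinary English word, a sentence such as "I may 5 times" would be a false positive, so treat the result as a candidate rather than a guaranteed date.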