Full-text search with Sphinx works fine for retrieving content.

After searching, though, sorting the results was still a problem. I started with reverse chronological order, which meant that the results I wanted, the ones closest to my search terms, often didn't appear in the first few pages. That's not a good experience.

So I used PHP's built-in similar_text function to compute how similar each article's title and description are to the search terms, then sorted the results by that similarity value in descending order.
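For reference, similar_text returns the number of matching bytes and, if you pass a third argument by reference, also fills it with the similarity as a percentage. A minimal sketch (the sample strings are my own, not from the search data):

```php
<?php
// similar_text() returns the count of matching bytes and writes the
// similarity percentage into the third (by-reference) argument.
$common = similar_text("World", "Word", $percent);

// Percentage = 2 * matches / (len1 + len2) * 100
echo "matches: $common, similarity: " . round($percent, 2) . "%\n";
```

The percentage is the value worth sorting on, since the raw match count favors long strings.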

I wrapped this in a function. It's only an example; adapt it to your own needs:

function similar_arr($array, $keyword, $arr_key = 'title')
{
    // Initialize so an empty input still returns an array
    $data = array();

    // Compute the similarity of each row's field to the keyword
    foreach ($array as $key => $value)
    {
        similar_text($value[$arr_key], $keyword, $percent);
        $value['percent'] = $percent;
        $data[] = $value;
    }

    // Take the percent column from the array and return a one-dimensional array
    $percent = array_column($data, 'percent');

    // Sort by percent, highest first
    array_multisort($percent, SORT_DESC, $data);

    return $data;
}

// $data is a two-dimensional array
$res = similar_arr($data, 'wechat Mini Program');
var_dump($res);

This works, but similar_text does not handle Chinese text well, which left me at a bit of a dead end.
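One likely reason: similar_text compares bytes, not characters, and in UTF-8 a Chinese character usually takes three bytes. Two different characters can share leading bytes and so look "similar". A small sketch of the effect, assuming the mbstring extension is available for mb_strlen (the two sample characters are my own choice):

```php
<?php
// Two different Chinese characters, written as code points so the
// example does not depend on the file's encoding.
$a = "\u{4E2D}"; // 中, UTF-8 bytes E4 B8 AD
$b = "\u{4E22}"; // 丢, UTF-8 bytes E4 B8 A2

$common = similar_text($a, $b, $percent);

// One character each, but three bytes each.
echo mb_strlen($a, 'UTF-8') . " character, " . strlen($a) . " bytes\n";

// The characters differ, yet two of the three bytes match.
echo "matching bytes: $common, similarity: " . round($percent, 2) . "%\n";
```

So the score partly reflects how the text happens to be encoded rather than which characters the two strings share.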

So what now? Giving up on similarity sorting wasn't an option.

Fortunately, there are plenty of experts to be found through Baidu. I found a class that calculates similarity for Chinese text and modified it slightly.

The complete class is as follows:

LcsController.php

<?php

namespace App\Http\Controllers\index;

/**
 * @name: Article similarity calculation class
 * @author: camellia
 * @date: 2021-03-04
 */
class LcsController extends BaseController
{
    private $str1;
    private $str2;
    private $c = array();

    /**
     * @name: Returns the longest common subsequence of strings one and two
     * @author: camellia
     * @date: 2021-03-04
     * @param:  $str1, $str2    string    the strings to compare
     * @param:  $len1, $len2    int       their lengths (computed with strlen when 0)
     * @return: string    the longest common subsequence
     */
    public function getLCS($str1, $str2, $len1 = 0, $len2 = 0)
    {
        $this->str1 = $str1;
        $this->str2 = $str2;
        if ($len1 == 0) $len1 = strlen($str1);
        if ($len2 == 0) $len2 = strlen($str2);
        $this->initC($len1, $len2);
        return $this->printLCS($this->c, $len1 - 1, $len2 - 1);
    }
    /**
     * @name: Returns the similarity of two strings
     * @author: camellia
     * @date: 2021-03-04
     * @param:  $str1, $str2    string    the strings to compare
     * @return: float    2 * LCS length / (len1 + len2), between 0 and 1
     */
    public function getSimilar($str1, $str2)
    {
        $len1 = strlen($str1);
        $len2 = strlen($str2);
        $len = strlen($this->getLCS($str1, $str2, $len1, $len2));
        if (($len1 + $len2) > 0)
        {
            return $len * 2 / ($len1 + $len2);
        }
        else
        {
            return 0;
        }
    }

    /**
     * @name: Fills the LCS length table $this->c
     * @author: camellia
     * @date: 2021-03-04
     * @param:  $len1, $len2    int    lengths of strings one and two
     * @return: void
     */
    public function initC($len1, $len2)
    {
        // Row 0 and column 0 start at zero
        for ($i = 0; $i < $len1; $i++)
        {
            $this->c[$i][0] = 0;
        }
        for ($j = 0; $j < $len2; $j++)
        {
            $this->c[0][$j] = 0;
        }
        for ($i = 1; $i < $len1; $i++)
        {
            for ($j = 1; $j < $len2; $j++)
            {
                if ($this->str1[$i] == $this->str2[$j])
                {
                    $this->c[$i][$j] = $this->c[$i - 1][$j - 1] + 1;
                }
                else if ($this->c[$i - 1][$j] >= $this->c[$i][$j - 1])
                {
                    $this->c[$i][$j] = $this->c[$i - 1][$j];
                }
                else
                {
                    $this->c[$i][$j] = $this->c[$i][$j - 1];
                }
            }
        }
    }

    /**
     * @name: Rebuilds the LCS string by walking back through the table
     * @author: camellia
     * @date: 2021-03-04
     * @param:  $c        array    the LCS length table
     * @param:  $i, $j    int      current positions in strings one and two
     * @return: string    the common subsequence up to positions $i and $j
     */
    public function printLCS($c, $i, $j)
    {
        if ($i < 0 || $j < 0)
        {
            return "";
        }
        if ($i == 0 || $j == 0)
        {
            if ($this->str1[$i] == $this->str2[$j])
            {
                return $this->str2[$j];
            }
            else
            {
                return "";
            }
        }
        if ($this->str1[$i] == $this->str2[$j])
        {
            return $this->printLCS($this->c, $i - 1, $j - 1) . $this->str2[$j];
        }
        else if ($this->c[$i - 1][$j] >= $this->c[$i][$j - 1])
        {
            return $this->printLCS($this->c, $i - 1, $j);
        }
        else
        {
            return $this->printLCS($this->c, $i, $j - 1);
        }
    }
}
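Note that the class above measures lengths with strlen and indexes strings byte by byte. As a rough sketch of a possible variant (not the class I actually use), the same 2 * LCS / (len1 + len2) ratio can be computed over UTF-8 characters instead. The function name lcs_similarity is hypothetical, and the table here is the textbook one with an extra row and column of zeros:

```php
<?php
// Hypothetical helper: LCS-based similarity counted in UTF-8 characters.
function lcs_similarity(string $a, string $b): float
{
    // Split into characters rather than bytes; fall back to an empty
    // array if the input is not valid UTF-8.
    $x = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY) ?: array();
    $y = preg_split('//u', $b, -1, PREG_SPLIT_NO_EMPTY) ?: array();
    $n = count($x);
    $m = count($y);
    if ($n + $m === 0) {
        return 0.0;
    }

    // $c[$i][$j] = LCS length of the first $i chars of $a
    // and the first $j chars of $b.
    $c = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = 1; $i <= $n; $i++) {
        for ($j = 1; $j <= $m; $j++) {
            $c[$i][$j] = ($x[$i - 1] === $y[$j - 1])
                ? $c[$i - 1][$j - 1] + 1
                : max($c[$i - 1][$j], $c[$i][$j - 1]);
        }
    }

    // Same ratio as getSimilar, but in characters.
    return 2 * $c[$n][$m] / ($n + $m);
}

// A 5-character keyword against a 7-character title that contains it.
echo round(lcs_similarity("微信小程序", "微信小程序开发"), 2) . "\n";
```

Only the table's last cell is needed for the ratio, so this sketch skips rebuilding the subsequence string.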

Call:

    /**
     * @name: Sorts arrays by similarity
     * @author: camellia
     * @date: 2021-03-04
     * @param:  $array      array     two-dimensional array of articles
     * @param:  $keyword    string    the search terms
     * @param:  $arr_key_one, $arr_key_two, $arr_key_three    string    field names to compare
     * @return: array    the articles sorted by similarity
     */
    public function similar_arr($array, $keyword, $arr_key_one = 'arttitle', $arr_key_two = 'content', $arr_key_three = 'artdesc')
    {
        $lcs = new LcsController();
        // Initialize so an empty input still sorts cleanly
        $data = array();

        // Array similarity processing
        foreach ($array as $key => $value) {
            // similar_text is not very friendly to Chinese similarity
            // similar_text($value[$arr_key], $keyword, $percent);
            $title_percent = $lcs->getSimilar($value[$arr_key_one], $keyword);
            // Returns the longest common subsequence
            // echo $lcs->getLCS("hello word", "hello china");
            $value['title_percent'] = $title_percent;
            // $content_percent = $lcs->getSimilar($value[$arr_key_two], $keyword);
            // $value['content_percent'] = $content_percent;
            $desc_percent = $lcs->getSimilar($value[$arr_key_three], $keyword);
            $value['desc_percent'] = $desc_percent;
            $data[] = $value;
        }

        // Take the percent column from the array and return a one-dimensional array
        // $percent = array_column($data, 'percent');
        // Sort by percent
        // array_multisort($percent, SORT_DESC, $data);
        // $array = $this->sortArrByManyField($data, 'title_percent', SORT_DESC, 'content_percent', SORT_DESC, 'desc_percent', SORT_DESC);
        $array = $this->sortArrByManyField($data, 'title_percent', SORT_DESC, 'id', SORT_DESC, 'desc_percent', SORT_DESC);
        return $array;
    }
    /**
     * @name: Sorts a two-dimensional PHP array by multiple fields
     * @author: camellia
     * @date: 2021-03-04
     * @param:  array, then pairs of field name (string) and sort flag (SORT_ASC / SORT_DESC)
     * @return: array|null    the sorted array, or null when called without arguments
     */
    public function sortArrByManyField()
    {
        $args = func_get_args(); // Get an array of function arguments
        if (empty($args))
        {
            return null;
        }
        $arr = array_shift($args);
        if (!is_array($arr))
        {
            // Leading backslash: resolve to the global Exception class from inside a namespace
            throw new \Exception("The first argument is not an array");
        }
        foreach ($args as $key => $field)
        {
            // Replace each field name with that field's column of values
            if (is_string($field))
            {
                $temp = array();
                foreach ($arr as $index => $val)
                {
                    $temp[$index] = $val[$field];
                }
                $args[$key] = $temp;
            }
        }
        $args[] = &$arr; // Pass by reference so array_multisort reorders it
        call_user_func_array('array_multisort', $args);
        return array_pop($args);
    }
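If the func_get_args approach feels heavy, the same ordering (title_percent, then id, then desc_percent, all descending) can probably be expressed with usort and the spaceship operator. This is only a sketch assuming PHP 7 or later; the sample rows are made up for illustration:

```php
<?php
// Made-up rows; the field names follow the code above.
$rows = [
    ['id' => 1, 'title_percent' => 0.5, 'desc_percent' => 0.2],
    ['id' => 2, 'title_percent' => 0.8, 'desc_percent' => 0.1],
    ['id' => 3, 'title_percent' => 0.5, 'desc_percent' => 0.9],
];

// Arrays of equal length compare element by element, so putting $b on
// the left sorts every field in descending order.
usort($rows, function (array $a, array $b): int {
    return [$b['title_percent'], $b['id'], $b['desc_percent']]
       <=> [$a['title_percent'], $a['id'], $a['desc_percent']];
});

// Rows are now ordered by id: 2, 3, 1
print_r(array_column($rows, 'id'));
```

With this field order, desc_percent only breaks a tie when two rows share both title_percent and id, so if id is unique it never affects the result.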

Call the similarity sorting method:

 $listShow = $this->similar_arr($list, $search, 'arttitle');

The resulting similarity isn’t exactly accurate, but it’s better than the built-in similar_text function in PHP.

For specific results, please visit my personal blog: guanchao.site

If you have any suggestions, please leave a comment below.

Welcome to my blog guanchao.site
