[Study Notes] How Has ChatGPT Affected Q&A Communities?

news2024/11/15 18:18:29

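        This post takes its cue from the title of a blog post by Prashanth Chandrasekar, CEO of Stack Overflow (the company behind the Stack Exchange network): "Community is the future of AI". The argument here is that ChatGPT has had a disruptive effect on Q&A communities, which must tackle the problem at its root and reinvent themselves, and that we should nonetheless hold on to the belief that "community is the future of AI".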

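Contents

I. Impact

(1) Fewer new questions are being created

1. Result data

2. Line chart

3. Comparison

4. Conclusion

(2) New answers

1. Result data

2. Line chart

3. Comparison

4. Conclusion

(3) New user registrations

1. Result data

2. Line chart

3. Comparison

4. Conclusion

II. Reflections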


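I. Impact

        The content draws on an answer (by user starball) to the following Stack Exchange question: Did Stack Exchange's traffic go down since ChatGPT? https://meta.stackexchange.com/questions/387278/did-stack-exchanges-traffic-go-down-since-chatgpt

        Note: the results come from Stack Exchange Data Explorer (SEDE) queries. The statistics start in 2018 and run through the month before this post was written (June 2023). To borrow the answerer's words: "2018 chosen somewhat arbitrarily - I just wanted some more context than looking at just 2022-2023".

(1) Fewer new questions are being created

        The query is shown below. Unlike the original answer, it adds an upper bound on the creation date so that the statistics end at a fixed month:

AND P.CreationDate < DATEFROMPARTS(2023, 7, 1) -- Keep data through June 2023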

/*-- INSTRUCTIONS:
    1)  Set the columns of #AllSiteResults to what you need in the final query.
    2)  Set the @seSiteQuery text (inside the WHILE loop) to the query that will run on each site to build
        the #AllSiteResults table.
    3)  Comment out the `WHERE       (dadn.dbName = 'StackExchange.Meta'...` line if site metas are desired.
    4)  Adjust the final query if post processing is desired (optional).
*/
DECLARE @seDbName       AS NVARCHAR (max)
DECLARE @seSiteURL      AS NVARCHAR (max)
DECLARE @sitePrettyName AS NVARCHAR (max)
DECLARE @seSiteQuery    AS NVARCHAR (max)

CREATE TABLE #AllSiteResults (
      -- PUT THE COLUMNS YOU WILL USE, HERE
      -- vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
      [Date] DATE,
      [NewQuestions] INT
      -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)

DECLARE seSites_crsr CURSOR FOR
WITH dbsAndDomainNames AS (
    SELECT      dbL.dbName
                , STRING_AGG (dbL.domainPieces, '.')    AS siteDomain
    FROM (
        SELECT      TOP 50000   -- Never be that many sites and TOP is needed for order by, below
                    name        AS dbName
                    , value     AS domainPieces
                    , row_number ()  OVER (ORDER BY (SELECT 0)) AS [rowN]
        FROM        sys.databases
        CROSS APPLY STRING_SPLIT (name, '.')
        WHERE       CASE    WHEN state_desc = 'ONLINE'
                            THEN OBJECT_ID (QUOTENAME (name) + '.[dbo].[PostNotices]', 'U') -- Pick a table unique to SE data
                    END
                    IS NOT NULL
        ORDER BY    dbName, [rowN] DESC
    ) AS dbL
    GROUP BY    dbL.dbName
)
SELECT      REPLACE (REPLACE (dadn.dbName, 'StackExchange.', ''), '.', ' ' )  AS [Site Name]
            , dadn.dbName
            , CASE  -- See https://meta.stackexchange.com/q/215071
                    WHEN dadn.dbName = 'StackExchange.Mathoverflow.Meta'
                    THEN 'https://meta.mathoverflow.net/'
                    -- Some AVP/Audio/Video/Sound kerfuffle?
                    WHEN dadn.dbName = 'StackExchange.Audio'
                    THEN 'https://video.stackexchange.com/'
                    -- Ditto
                    WHEN dadn.dbName = 'StackExchange.Audio.Meta'
                    THEN 'https://video.meta.stackexchange.com/'
                    -- Normal site
                    ELSE 'https://' + LOWER (siteDomain) + '.com/'
            END AS siteURL
FROM        dbsAndDomainNames dadn
WHERE       (dadn.dbName = 'StackExchange.Meta'  OR  dadn.dbName NOT LIKE '%Meta%')

-- Step through cursor
OPEN    seSites_crsr
FETCH   NEXT FROM seSites_crsr INTO @sitePrettyName, @seDbName, @seSiteURL
WHILE   @@FETCH_STATUS = 0
BEGIN
    -- QUERY THAT YOU WANT TO RUN ON EACH SITE, GOES HERE
    -- For example:
    -- vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
    SET @seSiteQuery = '
        USE [' + @seDbName + ']
        INSERT INTO #AllSiteResults
        SELECT
            DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1) AS Date,
            COUNT(*) AS NewQuestions
        FROM Posts P
        WHERE P.PostTypeId = 1 -- Questions
            AND YEAR(P.CreationDate) >= 2018
            AND P.CreationDate < DATEFROMPARTS(2023, 7, 1) -- Keep data through June 2023
        GROUP BY DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1)
    '
    -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    EXEC sp_executesql @seSiteQuery

    FETCH NEXT FROM seSites_crsr INTO @sitePrettyName, @seDbName, @seSiteURL
END
CLOSE       seSites_crsr
DEALLOCATE  seSites_crsr

-- ADJUST THIS QUERY IF ANY POST PROCESSING IS DESIRED.
SELECT      MAX([Date]) AS Date, SUM([NewQuestions]) AS NewQuestions
FROM        #AllSiteResults
GROUP BY    [Date]
ORDER BY    [Date]

1. Result data

        The results are as follows:

# New questions per month across the whole network
Date	NewQuestions
2018/1/1 0:00	237404
2018/2/1 0:00	224143
2018/3/1 0:00	251429
2018/4/1 0:00	239636
2018/5/1 0:00	246762
2018/6/1 0:00	222615
2018/7/1 0:00	228026
2018/8/1 0:00	228371
2018/9/1 0:00	210906
2018/10/1 0:00	232536
2018/11/1 0:00	219300
2018/12/1 0:00	196715
2019/1/1 0:00	223493
2019/2/1 0:00	215514
2019/3/1 0:00	236543
2019/4/1 0:00	226815
2019/5/1 0:00	225382
2019/6/1 0:00	201296
2019/7/1 0:00	217943
2019/8/1 0:00	201189
2019/9/1 0:00	199442
2019/10/1 0:00	219626
2019/11/1 0:00	214303
2019/12/1 0:00	194623
2020/1/1 0:00	212131
2020/2/1 0:00	208687
2020/3/1 0:00	223977
2020/4/1 0:00	261901
2020/5/1 0:00	266095
2020/6/1 0:00	242444
2020/7/1 0:00	232102
2020/8/1 0:00	209958
2020/9/1 0:00	202312
2020/10/1 0:00	206767
2020/11/1 0:00	197732
2020/12/1 0:00	196899
2021/1/1 0:00	205239
2021/2/1 0:00	191901
2021/3/1 0:00	215623
2021/4/1 0:00	196910
2021/5/1 0:00	193489
2021/6/1 0:00	183171
2021/7/1 0:00	174020
2021/8/1 0:00	171779
2021/9/1 0:00	168816
2021/10/1 0:00	168644
2021/11/1 0:00	169028
2021/12/1 0:00	158997
2022/1/1 0:00	169548
2022/2/1 0:00	158851
2022/3/1 0:00	171107
2022/4/1 0:00	160153
2022/5/1 0:00	163367
2022/6/1 0:00	156387
2022/7/1 0:00	185296
2022/8/1 0:00	189989
2022/9/1 0:00	176730
2022/10/1 0:00	184835
2022/11/1 0:00	189896
2022/12/1 0:00	168000
2023/1/1 0:00	170610
2023/2/1 0:00	155088
2023/3/1 0:00	160060
2023/4/1 0:00	131668
2023/5/1 0:00	130464
2023/6/1 0:00	140041

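2. Line chart

        The line chart is as follows:

[Figure: new questions per month across the Stack Exchange network, January 2018 to June 2023]

3. Comparison

        Stack Overflow is the largest site in the Stack Exchange network, so the same statistic is computed for Stack Overflow alone for comparison.

        The query is as follows: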

SELECT
  DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1) AS Date,
  COUNT(*) AS NewQuestions
FROM Posts P
WHERE P.PostTypeId = 1 -- Questions
  AND YEAR(P.CreationDate) >= 2018
  AND P.CreationDate < DATEFROMPARTS(2023, 7, 1)
GROUP BY DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1)
ORDER BY Date

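        The chart is as follows:

[Figure: new questions per month on Stack Overflow, January 2018 to June 2023]

        As the chart shows, the Stack Overflow trend is close to the overall Stack Exchange trend.

4. Conclusion

        Activity usually drops sharply every December. The drop is quite visible in 2018 and 2019, less so in 2020 and 2021, and quite visible again in 2022. It is therefore hard to say whether the December 2022 drop is the normal seasonal decline over the winter holidays or the result of people asking ChatGPT instead of Stack Exchange.

        However, in recent years activity has always picked up again in January (observable each January from 2019 through 2022). January 2023 is different: new-question activity stayed almost level with December 2022 and kept falling in the months that followed.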
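The December dip and January rebound described above can be checked directly against the result table. The following is a minimal Python sketch and not part of the original analysis: the dictionary holds the November, December, and January counts copied from the table above, and the helper name `pct_change` is made up for illustration.

```python
# Monthly new-question counts copied from the result table above
# (only the months needed for the December/January comparison).
new_questions = {
    "2018-11": 219300, "2018-12": 196715, "2019-01": 223493,
    "2019-11": 214303, "2019-12": 194623, "2020-01": 212131,
    "2020-11": 197732, "2020-12": 196899, "2021-01": 205239,
    "2021-11": 169028, "2021-12": 158997, "2022-01": 169548,
    "2022-11": 189896, "2022-12": 168000, "2023-01": 170610,
}


def pct_change(before: int, after: int) -> float:
    """Percentage change from `before` to `after` (hypothetical helper)."""
    return (after - before) / before * 100


december_dip = {}      # keyed by the year of the December
january_rebound = {}   # keyed by the year of the January
for year in range(2018, 2023):
    nov = new_questions[f"{year}-11"]
    dec = new_questions[f"{year}-12"]
    jan = new_questions[f"{year + 1}-01"]
    december_dip[year] = pct_change(nov, dec)
    january_rebound[year + 1] = pct_change(dec, jan)

for year in range(2018, 2023):
    print(f"Dec {year}: {december_dip[year]:+.1f}%   "
          f"Jan {year + 1}: {january_rebound[year + 1]:+.1f}%")
```

On the table's numbers, the January 2023 rebound (about 1.6%) is the smallest of the five; the earlier ones range from roughly 4% to 14%.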

(2) New answers

        The query is as follows:

/*-- INSTRUCTIONS:
    1)  Set the columns of #AllSiteResults to what you need in the final query.
    2)  Set the @seSiteQuery text (inside the WHILE loop) to the query that will run on each site to build
        the #AllSiteResults table.
    3)  Comment out the `WHERE       (dadn.dbName = 'StackExchange.Meta'...` line if site metas are desired.
    4)  Adjust the final query if post processing is desired (optional).
*/
DECLARE @seDbName       AS NVARCHAR (max)
DECLARE @seSiteURL      AS NVARCHAR (max)
DECLARE @sitePrettyName AS NVARCHAR (max)
DECLARE @seSiteQuery    AS NVARCHAR (max)

CREATE TABLE #AllSiteResults (
      -- PUT THE COLUMNS YOU WILL USE, HERE
      -- vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
      [Date] DATE,
      [Type] NVARCHAR(max),
      [NewAnswers] REAL,
      [NewQuestions] REAL
      -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)

DECLARE seSites_crsr CURSOR FOR
WITH dbsAndDomainNames AS (
    SELECT      dbL.dbName
                , STRING_AGG (dbL.domainPieces, '.')    AS siteDomain
    FROM (
        SELECT      TOP 50000   -- Never be that many sites and TOP is needed for order by, below
                    name        AS dbName
                    , value     AS domainPieces
                    , row_number ()  OVER (ORDER BY (SELECT 0)) AS [rowN]
        FROM        sys.databases
        CROSS APPLY STRING_SPLIT (name, '.')
        WHERE       CASE    WHEN state_desc = 'ONLINE'
                            THEN OBJECT_ID (QUOTENAME (name) + '.[dbo].[PostNotices]', 'U') -- Pick a table unique to SE data
                    END
                    IS NOT NULL
        ORDER BY    dbName, [rowN] DESC
    ) AS dbL
    GROUP BY    dbL.dbName
)
SELECT      REPLACE (REPLACE (dadn.dbName, 'StackExchange.', ''), '.', ' ' )  AS [Site Name]
            , dadn.dbName
            , CASE  -- See https://meta.stackexchange.com/q/215071
                    WHEN dadn.dbName = 'StackExchange.Mathoverflow.Meta'
                    THEN 'https://meta.mathoverflow.net/'
                    -- Some AVP/Audio/Video/Sound kerfuffle?
                    WHEN dadn.dbName = 'StackExchange.Audio'
                    THEN 'https://video.stackexchange.com/'
                    -- Ditto
                    WHEN dadn.dbName = 'StackExchange.Audio.Meta'
                    THEN 'https://video.meta.stackexchange.com/'
                    -- Normal site
                    ELSE 'https://' + LOWER (siteDomain) + '.com/'
            END AS siteURL
FROM        dbsAndDomainNames dadn
WHERE       (dadn.dbName = 'StackExchange.Meta'  OR  dadn.dbName NOT LIKE '%Meta%')

-- Step through cursor
OPEN    seSites_crsr
FETCH   NEXT FROM seSites_crsr INTO @sitePrettyName, @seDbName, @seSiteURL
WHILE   @@FETCH_STATUS = 0
BEGIN
    -- QUERY THAT YOU WANT TO RUN ON EACH SITE, GOES HERE
    -- For example:
    -- vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
    SET @seSiteQuery = '
        USE [' + @seDbName + ']
        INSERT INTO #AllSiteResults
        -- Ignore the datapoints for the current month. That data is not yet complete. Ex. Things like roomba have "lag"
        SELECT
          DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1) AS Date,
          ''average new non-deleted answers per new question'' AS Type,
          CAST(SUM(CASE WHEN P.PostTypeId = 2 THEN 1 ELSE 0 END) AS REAL) AS NewAnswerCount,
          CAST(SUM(CASE WHEN P.PostTypeId = 1 THEN 1 ELSE 0 END) AS REAL) AS NewQuestionCount
        FROM PostsWithDeleted P
        WHERE P.PostTypeId IN (1,2)
          AND DATEFROMPARTS(2017, 12, 1) < P.CreationDate
          AND P.CreationDate < DATEFROMPARTS(2023, 7, 1)
          AND (P.PostTypeId = 1 OR P.DeletionDate IS NULL)
        GROUP BY DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1)
        UNION
        SELECT
          DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1) AS Date,
          ''average new deleted answers per new question'' AS Type,
          CAST(SUM(CASE WHEN P.PostTypeId = 2 THEN 1 ELSE 0 END) AS REAL) AS NewAnswerCount,
          CAST(SUM(CASE WHEN P.PostTypeId = 1 THEN 1 ELSE 0 END) AS REAL) AS NewQuestionCount
        FROM PostsWithDeleted P
        WHERE P.PostTypeId IN (1,2)
          AND DATEFROMPARTS(2017, 12, 1) < P.CreationDate
          AND P.CreationDate < DATEFROMPARTS(2023, 7, 1)
          AND (P.PostTypeId = 1 OR P.DeletionDate IS NOT NULL)
        GROUP BY DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1)
        --ORDER BY Date
    '
    -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    EXEC sp_executesql @seSiteQuery

    FETCH NEXT FROM seSites_crsr INTO @sitePrettyName, @seDbName, @seSiteURL
END
CLOSE       seSites_crsr
DEALLOCATE  seSites_crsr

-- ADJUST THIS QUERY IF ANY POST PROCESSING IS DESIRED.
SELECT      MAX([Date]) AS [Date], MAX([Type]) AS [Type], SUM([NewAnswers]) / SUM([NewQuestions]) AS [AverageNewAnswersPerNewQuestion]
FROM        #AllSiteResults
GROUP BY    [Date], [Type]
ORDER BY    [Date]
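The final SELECT divides the summed answer counts by the summed question counts, so a large site weighs more than a small one in the network-wide figure. A minimal Python sketch of that aggregation follows; the site names and counts are invented purely for illustration and are not SEDE data.

```python
# Hypothetical per-site monthly counts (invented numbers, not SEDE data).
site_counts = [
    {"site": "big_site", "new_answers": 90_000, "new_questions": 120_000},
    {"site": "small_site", "new_answers": 300, "new_questions": 200},
]

# What the final SELECT does: SUM(NewAnswers) / SUM(NewQuestions).
total_answers = sum(row["new_answers"] for row in site_counts)
total_questions = sum(row["new_questions"] for row in site_counts)
ratio_of_sums = total_answers / total_questions

# A different statistic, shown only for contrast: the unweighted mean
# of each site's own ratio, which lets a tiny site pull the figure up.
mean_of_ratios = sum(
    row["new_answers"] / row["new_questions"] for row in site_counts
) / len(site_counts)

print(f"ratio of sums:  {ratio_of_sums:.3f}")
print(f"mean of ratios: {mean_of_ratios:.3f}")
```

Because the query uses the ratio of sums, the network-wide series is dominated by the largest sites, which is worth keeping in mind when comparing it with the Stack Overflow-only numbers.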

1. Result data

# Average number of answers per new question across the whole network
Date	Type	AverageNewAnswersPerNewQuestion
2017/12/1 0:00	average new deleted answers per new question	0.183502
2017/12/1 0:00	average new non-deleted answers per new question	0.89314
2018/1/1 0:00	average new non-deleted answers per new question	0.905366
2018/1/1 0:00	average new deleted answers per new question	0.183494
2018/2/1 0:00	average new deleted answers per new question	0.183517
2018/2/1 0:00	average new non-deleted answers per new question	0.89258
2018/3/1 0:00	average new deleted answers per new question	0.183375
2018/3/1 0:00	average new non-deleted answers per new question	0.884344
2018/4/1 0:00	average new non-deleted answers per new question	0.888751
2018/4/1 0:00	average new deleted answers per new question	0.179318
2018/5/1 0:00	average new non-deleted answers per new question	0.88925
2018/5/1 0:00	average new deleted answers per new question	0.175033
2018/6/1 0:00	average new deleted answers per new question	0.173389
2018/6/1 0:00	average new non-deleted answers per new question	0.8859
2018/7/1 0:00	average new deleted answers per new question	0.176872
2018/7/1 0:00	average new non-deleted answers per new question	0.906406
2018/8/1 0:00	average new non-deleted answers per new question	0.931269
2018/8/1 0:00	average new deleted answers per new question	0.180633
2018/9/1 0:00	average new deleted answers per new question	0.180515
2018/9/1 0:00	average new non-deleted answers per new question	0.919448
2018/10/1 0:00	average new non-deleted answers per new question	0.897453
2018/10/1 0:00	average new deleted answers per new question	0.176562
2018/11/1 0:00	average new deleted answers per new question	0.170907
2018/11/1 0:00	average new non-deleted answers per new question	0.869077
2018/12/1 0:00	average new deleted answers per new question	0.181469
2018/12/1 0:00	average new non-deleted answers per new question	0.89007
2019/1/1 0:00	average new non-deleted answers per new question	0.901642
2019/1/1 0:00	average new deleted answers per new question	0.180455
2019/2/1 0:00	average new deleted answers per new question	0.172495
2019/2/1 0:00	average new non-deleted answers per new question	0.901606
2019/3/1 0:00	average new non-deleted answers per new question	0.884096
2019/3/1 0:00	average new deleted answers per new question	0.169558
2019/4/1 0:00	average new deleted answers per new question	0.164192
2019/4/1 0:00	average new non-deleted answers per new question	0.862334
2019/5/1 0:00	average new deleted answers per new question	0.160113
2019/5/1 0:00	average new non-deleted answers per new question	0.860274
2019/6/1 0:00	average new non-deleted answers per new question	0.867573
2019/6/1 0:00	average new deleted answers per new question	0.16234
2019/7/1 0:00	average new deleted answers per new question	0.163545
2019/7/1 0:00	average new non-deleted answers per new question	0.875567
2019/8/1 0:00	average new non-deleted answers per new question	0.887144
2019/8/1 0:00	average new deleted answers per new question	0.170328
2019/9/1 0:00	average new deleted answers per new question	0.16466
2019/9/1 0:00	average new non-deleted answers per new question	0.882504
2019/10/1 0:00	average new deleted answers per new question	0.162603
2019/10/1 0:00	average new non-deleted answers per new question	0.872577
2019/11/1 0:00	average new non-deleted answers per new question	0.87025
2019/11/1 0:00	average new deleted answers per new question	0.158717
2019/12/1 0:00	average new deleted answers per new question	0.166101
2019/12/1 0:00	average new non-deleted answers per new question	0.870477
2020/1/1 0:00	average new non-deleted answers per new question	0.875078
2020/1/1 0:00	average new deleted answers per new question	0.160808
2020/2/1 0:00	average new deleted answers per new question	0.153799
2020/2/1 0:00	average new non-deleted answers per new question	0.849026
2020/3/1 0:00	average new non-deleted answers per new question	0.7816
2020/3/1 0:00	average new deleted answers per new question	0.146717
2020/4/1 0:00	average new deleted answers per new question	0.147589
2020/4/1 0:00	average new non-deleted answers per new question	0.779073
2020/5/1 0:00	average new non-deleted answers per new question	0.800361
2020/5/1 0:00	average new deleted answers per new question	0.151292
2020/6/1 0:00	average new deleted answers per new question	0.151263
2020/6/1 0:00	average new non-deleted answers per new question	0.806941
2020/7/1 0:00	average new deleted answers per new question	0.157548
2020/7/1 0:00	average new non-deleted answers per new question	0.825327
2020/8/1 0:00	average new deleted answers per new question	0.15776
2020/8/1 0:00	average new non-deleted answers per new question	0.824721
2020/9/1 0:00	average new non-deleted answers per new question	0.790762
2020/9/1 0:00	average new deleted answers per new question	0.154112
2020/10/1 0:00	average new non-deleted answers per new question	0.764484
2020/10/1 0:00	average new deleted answers per new question	0.157146
2020/11/1 0:00	average new deleted answers per new question	0.153182
2020/11/1 0:00	average new non-deleted answers per new question	0.750182
2020/12/1 0:00	average new deleted answers per new question	0.158255
2020/12/1 0:00	average new non-deleted answers per new question	0.794868
2021/1/1 0:00	average new non-deleted answers per new question	0.798135
2021/1/1 0:00	average new deleted answers per new question	0.157774
2021/2/1 0:00	average new deleted answers per new question	0.153559
2021/2/1 0:00	average new non-deleted answers per new question	0.779712
2021/3/1 0:00	average new non-deleted answers per new question	0.758168
2021/3/1 0:00	average new deleted answers per new question	0.143849
2021/4/1 0:00	average new deleted answers per new question	0.147879
2021/4/1 0:00	average new non-deleted answers per new question	0.754431
2021/5/1 0:00	average new deleted answers per new question	0.148288
2021/5/1 0:00	average new non-deleted answers per new question	0.755367
2021/6/1 0:00	average new non-deleted answers per new question	0.765796
2021/6/1 0:00	average new deleted answers per new question	0.146673
2021/7/1 0:00	average new non-deleted answers per new question	0.776896
2021/7/1 0:00	average new deleted answers per new question	0.146639
2021/8/1 0:00	average new non-deleted answers per new question	0.792048
2021/8/1 0:00	average new deleted answers per new question	0.15187
2021/9/1 0:00	average new deleted answers per new question	0.150652
2021/9/1 0:00	average new non-deleted answers per new question	0.785845
2021/10/1 0:00	average new deleted answers per new question	0.147064
2021/10/1 0:00	average new non-deleted answers per new question	0.755685
2021/11/1 0:00	average new non-deleted answers per new question	0.744011
2021/11/1 0:00	average new deleted answers per new question	0.141775
2021/12/1 0:00	average new non-deleted answers per new question	0.764816
2021/12/1 0:00	average new deleted answers per new question	0.148233
2022/1/1 0:00	average new deleted answers per new question	0.146973
2022/1/1 0:00	average new non-deleted answers per new question	0.780104
2022/2/1 0:00	average new deleted answers per new question	0.137886
2022/2/1 0:00	average new non-deleted answers per new question	0.753624
2022/3/1 0:00	average new deleted answers per new question	0.131572
2022/3/1 0:00	average new non-deleted answers per new question	0.729329
2022/4/1 0:00	average new non-deleted answers per new question	0.734639
2022/4/1 0:00	average new deleted answers per new question	0.13609
2022/5/1 0:00	average new non-deleted answers per new question	0.728135
2022/5/1 0:00	average new deleted answers per new question	0.131584
2022/6/1 0:00	average new deleted answers per new question	0.137936
2022/6/1 0:00	average new non-deleted answers per new question	0.743591
2022/7/1 0:00	average new deleted answers per new question	0.138265
2022/7/1 0:00	average new non-deleted answers per new question	0.764211
2022/8/1 0:00	average new deleted answers per new question	0.140399
2022/8/1 0:00	average new non-deleted answers per new question	0.766521
2022/9/1 0:00	average new non-deleted answers per new question	0.757918
2022/9/1 0:00	average new deleted answers per new question	0.134554
2022/10/1 0:00	average new non-deleted answers per new question	0.738382
2022/10/1 0:00	average new deleted answers per new question	0.13114
2022/11/1 0:00	average new non-deleted answers per new question	0.718665
2022/11/1 0:00	average new deleted answers per new question	0.130706
2022/12/1 0:00	average new non-deleted answers per new question	0.73153
2022/12/1 0:00	average new deleted answers per new question	0.182844
2023/1/1 0:00	average new deleted answers per new question	0.163113
2023/1/1 0:00	average new non-deleted answers per new question	0.755667
2023/2/1 0:00	average new non-deleted answers per new question	0.743234
2023/2/1 0:00	average new deleted answers per new question	0.157383
2023/3/1 0:00	average new deleted answers per new question	0.156859
2023/3/1 0:00	average new non-deleted answers per new question	0.741384
2023/4/1 0:00	average new non-deleted answers per new question	0.720417
2023/4/1 0:00	average new deleted answers per new question	0.151795
2023/5/1 0:00	average new non-deleted answers per new question	0.732252
2023/5/1 0:00	average new deleted answers per new question	0.143678
2023/6/1 0:00	average new deleted answers per new question	0.120529
2023/6/1 0:00	average new non-deleted answers per new question	0.759169

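2. Line chart

         Across the whole Stack Exchange network, the two series move almost in step, which suggests that the arrival of ChatGPT had little effect on established users (who tend to be the ones contributing answers to the community). This inference draws on a paper, which I believe is "Reading Answers on Stack Overflow: Not Enough!" (DOI: 10.1109/tse.2019.2954319).

        What stands out is the unprecedented jump in deleted answers from November 2022 to December 2022. It is worth asking what drove that rise in deletions. Was it the platform's restrictions on ChatGPT-generated answers? That is plausible, but it needs further verification.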
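The size of that jump can be read off the result table. The sketch below is an illustrative Python check rather than part of the original analysis: it copies the 2022 "deleted answers per new question" values from the table above and looks for the largest month-over-month change within that year only.

```python
# "Deleted answers per new question" for 2022, copied from the table above.
deleted_per_question = {
    "2022-01": 0.146973, "2022-02": 0.137886, "2022-03": 0.131572,
    "2022-04": 0.13609,  "2022-05": 0.131584, "2022-06": 0.137936,
    "2022-07": 0.138265, "2022-08": 0.140399, "2022-09": 0.134554,
    "2022-10": 0.13114,  "2022-11": 0.130706, "2022-12": 0.182844,
}

months = sorted(deleted_per_question)  # "YYYY-MM" keys sort chronologically

# Month-over-month difference, keyed by the later month.
monthly_change = {
    later: deleted_per_question[later] - deleted_per_question[earlier]
    for earlier, later in zip(months, months[1:])
}

largest_month = max(monthly_change, key=monthly_change.get)
relative_jump = (
    monthly_change["2022-12"] / deleted_per_question["2022-11"] * 100
)

print(f"largest increase in 2022: {largest_month} "
      f"({monthly_change[largest_month]:+.6f})")
print(f"Nov -> Dec 2022 relative change: {relative_jump:+.1f}%")
```

Within 2022, every other month-to-month change is below 0.01 in absolute value, so the December increase of about 0.052 (roughly 40% relative to November) clearly stands out.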

3. Comparison

        For comparison with Stack Overflow, the query is as follows:

SELECT
  DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1) AS Date,
  CASE WHEN P.DeletionDate IS NULL THEN 'non-deleted' ELSE 'deleted' END AS Status,
  COUNT(*) AS NewAnswers
FROM PostsWithDeleted P
WHERE P.PostTypeId = 2 -- Answers
  AND YEAR(P.CreationDate) >= 2018
  AND P.CreationDate < DATEFROMPARTS(2023, 7, 1)
GROUP BY
  DATEFROMPARTS(YEAR(P.CreationDate), MONTH(P.CreationDate), 1),
  CASE WHEN P.DeletionDate IS NULL THEN 'non-deleted' ELSE 'deleted' END
ORDER BY Date

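        The resulting chart is as follows:

[Figure: new deleted and non-deleted answers per month on Stack Overflow, January 2018 to June 2023]

         Although the overall trend has been downward in recent years, the decline has become steeper since December 2022. Stack Overflow, which accounts for the bulk of Stack Exchange activity, appears to have been hit hard by ChatGPT. The shape of this chart is similar to that of the new-question counts.

        To restrict the cross-site query at the start of this subsection to Stack Overflow, modify it as follows: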

FROM        dbsAndDomainNames dadn
-- WHERE       (dadn.dbName = 'StackExchange.Meta'  OR  dadn.dbName NOT LIKE '%Meta%')
WHERE       (dadn.dbName = 'StackOverflow') -- Only select Stack Overflow
-- Comment out the following line if site metas are desired
-- AND dadn.dbName NOT LIKE '%Meta%'
-- Step through cursor

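         With only Stack Overflow selected, the chart is as follows:

[Figure: average number of answers per new question on Stack Overflow, deleted and non-deleted]

         The chart shows a noticeable increase in how often newly asked questions receive answers.

4. Conclusion

        Stack Overflow accounts for the bulk of the network, and it mainly serves technical users. With the arrival of ChatGPT, many questions no longer need to be asked on the forum, which has lowered overall activity on Stack Exchange. The number of new answers is falling along the same trend. The rate at which questions get answered, however, does not seem to have been affected. On Stack Overflow, the average number of answers per new question has even been trending upward since the start of 2023, and answer deletions have dropped noticeably (have we entered a post-ChatGPT era?). The reasons still need further investigation.

(3) New user registrations

        The query is as follows: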

SELECT
  DATEFROMPARTS(YEAR(U.CreationDate), MONTH(U.CreationDate), 1) AS Date,
  COUNT(*) AS NewUsers
FROM Users U
WHERE YEAR(U.CreationDate) >= 2018
  AND U.CreationDate < DATEFROMPARTS(2023, 7, 1)
GROUP BY
  DATEFROMPARTS(YEAR(U.CreationDate), MONTH(U.CreationDate), 1)
ORDER BY Date

1. Result data

# New user registrations per month
Date	NewUsers
2018/1/1 0:00	303064
2018/2/1 0:00	277977
2018/3/1 0:00	323594
2018/4/1 0:00	298977
2018/5/1 0:00	304450
2018/6/1 0:00	262515
2018/7/1 0:00	274961
2018/8/1 0:00	278287
2018/9/1 0:00	272128
2018/10/1 0:00	304007
2018/11/1 0:00	301353
2018/12/1 0:00	274968
2019/1/1 0:00	319298
2019/2/1 0:00	291886
2019/3/1 0:00	315879
2019/4/1 0:00	312456
2019/5/1 0:00	312536
2019/6/1 0:00	280361
2019/7/1 0:00	296166
2019/8/1 0:00	287561
2019/9/1 0:00	284323
2019/10/1 0:00	316200
2019/11/1 0:00	291018
2019/12/1 0:00	294891
2020/1/1 0:00	318868
2020/2/1 0:00	307204
2020/3/1 0:00	344408
2020/4/1 0:00	468303
2020/5/1 0:00	384514
2020/6/1 0:00	344951
2020/7/1 0:00	323656
2020/8/1 0:00	305353
2020/9/1 0:00	307687
2020/10/1 0:00	332683
2020/11/1 0:00	322922
2020/12/1 0:00	345420
2021/1/1 0:00	363082
2021/2/1 0:00	339855
2021/3/1 0:00	394545
2021/4/1 0:00	420813
2021/5/1 0:00	667550
2021/6/1 0:00	617327
2021/7/1 0:00	421782
2021/8/1 0:00	444984
2021/9/1 0:00	535107
2021/10/1 0:00	568112
2021/11/1 0:00	509507
2021/12/1 0:00	372878
2022/1/1 0:00	420221
2022/2/1 0:00	389280
2022/3/1 0:00	446719
2022/4/1 0:00	480977
2022/5/1 0:00	371546
2022/6/1 0:00	345780
2022/7/1 0:00	395201
2022/8/1 0:00	359470
2022/9/1 0:00	368080
2022/10/1 0:00	382385
2022/11/1 0:00	400144
2022/12/1 0:00	373079
2023/1/1 0:00	345450
2023/2/1 0:00	307249
2023/3/1 0:00	359899
2023/4/1 0:00	359313
2023/5/1 0:00	343078
2023/6/1 0:00	285215

 

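2. Line chart

        In 2018, registrations fell in December and rebounded clearly the following January. In 2019, the table shows December slightly above November rather than below it, and January 2020 rose again. 2020 does not follow the December-dip pattern either, probably because of the pandemic, which may also explain the marked growth in new users in 2021 and 2022. In 2021, registrations fell in December and recovered the following January, so it is reasonable to assume that such a pattern exists. After the drop in December 2022, however, January 2023 fell even further.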
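The year-end pattern described above can be tested against the registration table. The following Python sketch is illustrative only: it copies the November, December, and January counts from the table above and lists the years in which December fell below November and the years in which January failed to exceed December.

```python
# Monthly new-user counts copied from the result table above
# (November, December, and the following January only).
new_users = {
    "2018-11": 301353, "2018-12": 274968, "2019-01": 319298,
    "2019-11": 291018, "2019-12": 294891, "2020-01": 318868,
    "2020-11": 322922, "2020-12": 345420, "2021-01": 363082,
    "2021-11": 509507, "2021-12": 372878, "2022-01": 420221,
    "2022-11": 400144, "2022-12": 373079, "2023-01": 345450,
}

# Decembers in which registrations fell below the preceding November.
december_drop = [
    year
    for year in range(2018, 2023)
    if new_users[f"{year}-12"] < new_users[f"{year}-11"]
]

# Januaries in which registrations fell below the preceding December.
no_rebound = [
    year + 1
    for year in range(2018, 2023)
    if new_users[f"{year + 1}-01"] < new_users[f"{year}-12"]
]

print("Decembers with a drop:", december_drop)
print("Januaries without a rebound:", no_rebound)
```

On the table's figures, January 2023 is the only January in the period that came in below the preceding December.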

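3. Comparison

        Looking at Stack Overflow alone rather than the whole network: in December 2018 new users fell by 13.1%, followed by a recovery the next January. From November to December 2019 they grew by 7.6%, then grew by a further 8.0% the next January. In 2020, December was roughly flat, and the next January rose by 9.1%. In December 2021 they fell by 6.1%, and the next January recovered by 15.4%. In December 2022 they fell by 5.8%, and the next January fell by another 12.9%.

4. Conclusion

        After December 2022, new-question activity did not recover in January 2023, and new user registrations fell further. Both are down relative to the pattern of previous years. The answering of new questions, however, was hardly affected.

        Possible contributing factors:

        People are using ChatGPT instead of Stack Exchange. This is probably an important factor.

        People who post ChatGPT-written answers on sites that ban them (such as Stack Overflow) get suspended, and therefore cannot ask or answer questions.

        The drop in activity coincides with the major tech layoffs of January 2023 and the following months. starball considers it impossible to rule this out as a factor, especially since most of the largest sites in the Stack Exchange network are technology-related.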

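II. Reflections

        To close, here is a passage from Prashanth Chandrasekar's blog post "Community is the future of AI" (https://stackoverflow.blog/2023/04/17/community-is-the-future-of-ai/), rendered here from the Chinese translation:

I have been talking with developers of varying experience levels, and I keep hearing anecdotes of novice programmers building simple web apps with the help of AI. Most of these stories, however, do not begin and end with an AI prompt. Rather, the AI provides a starting point and some initial momentum, and the human does the additional research and learning needed to finish the job. AI can debug some errors but is stymied by others. It can suggest a good backend service, but it often cannot resolve all the friction points that arise when integrating different services. And of course, when a problem is caused not by a machine instruction but by human error, the best answers come from other people who have run into the same problem.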
