Kaggle study notes - OTTO - baseline 4 - building a local CV


Overview

Step 1 - Generate candidates

For each test user we generate possible choices, i.e. candidates. The candidates come from 5 sources:

  • The user's history of clicks, carts and orders
  • The 20 most popular clicks, carts and orders during the test week
  • A click/cart/order-to-cart/order co-visitation matrix with type weighting
  • A cart/order-to-cart/order co-visitation matrix called buy2buy
  • A click/cart/order-to-click co-visitation matrix with time weighting

Step 2 - Re-rank and select 20

Given the list of candidates, we must choose 20 as our prediction. In this notebook we do that with a set of handcrafted rules. We could improve the predictions by training an XGBoost model to make the selection for us. Our handcrafted rules give priority to:

  • Items visited most recently
  • Items visited multiple times before
  • Items that were previously in a cart or order
  • The cart/order-to-cart/order co-visitation matrix
  • Currently popular items


Step 1 - Generate Candidates with RAPIDS

For candidate generation we build three co-visitation matrices. One computes the popularity of carts/orders given a user's previous clicks/carts/orders; we apply type weighting to this matrix. One computes the popularity of carts/orders given a user's previous carts/orders; we call this the "buy2buy" matrix. One computes the popularity of clicks given a user's previous clicks/carts/orders; we apply time weighting to this matrix. We use RAPIDS cuDF on the GPU to compute these matrices quickly.
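
Before the cuDF code, here is a minimal pandas sketch on toy data (the sessions, aids and timestamps below are made up) of what a single entry of a type-weighted co-visitation matrix means: every pair of distinct items seen in the same session within 24 hours contributes a weight that depends on the type of the target event.

```python
import pandas as pd

# Toy events: session, item id (aid), timestamp in seconds, type (0=clicks, 1=carts, 2=orders)
toy = pd.DataFrame({
    'session': [1, 1, 1, 2, 2],
    'aid':     [10, 20, 10, 20, 30],
    'ts':      [0, 60, 120, 0, 30],
    'type':    [0, 1, 0, 0, 2],
})
type_weight = {0: 1, 1: 6, 2: 3}   # same weights as the notebook uses

# Pair every event with every other event of the same session (self-merge),
# keep pairs of different items within 24 hours, weight by the target event's type.
pairs = toy.merge(toy, on='session')
pairs = pairs.loc[((pairs.ts_x - pairs.ts_y).abs() < 24 * 60 * 60) & (pairs.aid_x != pairs.aid_y)]
pairs = pairs.drop_duplicates(['session', 'aid_x', 'aid_y'])
pairs['wgt'] = pairs.type_y.map(type_weight)
covisit = pairs.groupby(['aid_x', 'aid_y']).wgt.sum().reset_index()
print(covisit)   # e.g. the pair (10, 20) gets weight 6 because item 20 was added to a cart
```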

VER is the version number used in the output file names.

VER = 6

import pandas as pd, numpy as np
from tqdm.notebook import tqdm
import os, sys, pickle, glob, gc
from collections import Counter
import cudf, itertools
print('We will use RAPIDS version',cudf.__version__)

Compute Three Co-visitation Matrices with RAPIDS

We compute the 3 co-visitation matrices with RAPIDS cuDF on the GPU. For maximum speed, set the variable DISK_PIECES to the smallest value your GPU allows without causing a memory error. If you run this code offline with 32GB of GPU memory, you can use DISK_PIECES = 1 and compute each co-visitation matrix in about 1 minute! Kaggle's GPUs only have 16GB of memory, so we use DISK_PIECES = 4 and each matrix takes an amazing 3 minutes! Here are a few tricks to speed up the computation:

  • Use RAPIDS cuDF on the GPU instead of Pandas on the CPU
  • Read from disk once and cache in CPU RAM for repeated GPU use later
  • Process as much data as possible on the GPU at a time
  • Merge data in two stages: many small pieces into a single medium piece, then many medium pieces into a single large piece
  • Write the result to parquet instead of a dictionary

%%time
# CACHE FUNCTIONS
def read_file(f):
    return cudf.DataFrame( data_cache[f] )
def read_file_to_cache(f):
    df = pd.read_parquet(f)
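    # raw ts is in milliseconds; convert to seconds so the values fit in int32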
    df.ts = (df.ts/1000).astype('int32')
    df['type'] = df['type'].map(type_labels).astype('int8')
    return df

# CACHE THE DATA ON CPU BEFORE PROCESSING ON GPU
data_cache = {}
type_labels = {'clicks':0, 'carts':1, 'orders':2}
files = glob.glob('../input/otto-validation/*_parquet/*')
for f in files: data_cache[f] = read_file_to_cache(f)

# CHUNK PARAMETERS
READ_CT = 5
CHUNK = int( np.ceil( len(files)/6 ))
print(f'We will process {len(files)} files, in groups of {READ_CT} and chunks of {CHUNK}.')

1) "Carts Orders" Co-visitation Matrix - Type Weighted

%%time
type_weight = {0:1, 1:6, 2:3}

# USE SMALLEST DISK_PIECES POSSIBLE WITHOUT MEMORY ERROR
DISK_PIECES = 4
SIZE = 1.86e6/DISK_PIECES

# COMPUTE IN PARTS FOR MEMORY MANAGEMENT
for PART in range(DISK_PIECES):
    print()
    print('### DISK PART',PART+1)
    
    # MERGE IS FASTEST PROCESSING CHUNKS WITHIN CHUNKS
    # => OUTER CHUNKS
    for j in range(6):
        a = j*CHUNK
        b = min( (j+1)*CHUNK, len(files) )
        print(f'Processing files {a} thru {b-1} in groups of {READ_CT}...')
        
        # => INNER CHUNKS
        for k in range(a,b,READ_CT):
            # READ FILE
            df = [read_file(files[k])]
            for i in range(1,READ_CT): 
                if k+i<b: df.append( read_file(files[k+i]) )
            df = cudf.concat(df,ignore_index=True,axis=0)
            df = df.sort_values(['session','ts'],ascending=[True,False])
            # USE TAIL OF SESSION
            df = df.reset_index(drop=True)
            df['n'] = df.groupby('session').cumcount()
            df = df.loc[df.n<30].drop('n',axis=1)
            # CREATE PAIRS
            df = df.merge(df,on='session')
            df = df.loc[ ((df.ts_x - df.ts_y).abs()< 24 * 60 * 60) & (df.aid_x != df.aid_y) ]
            # MEMORY MANAGEMENT COMPUTE IN PARTS
            df = df.loc[(df.aid_x >= PART*SIZE)&(df.aid_x < (PART+1)*SIZE)]
            # ASSIGN WEIGHTS
            df = df[['session', 'aid_x', 'aid_y','type_y']].drop_duplicates(['session', 'aid_x', 'aid_y'])
            df['wgt'] = df.type_y.map(type_weight)
            df = df[['aid_x','aid_y','wgt']]
            df.wgt = df.wgt.astype('float32')
            df = df.groupby(['aid_x','aid_y']).wgt.sum()
            # COMBINE INNER CHUNKS
            if k==a: tmp2 = df
            else: tmp2 = tmp2.add(df, fill_value=0)
            print(k,', ',end='')
        print()
        # COMBINE OUTER CHUNKS
        if a==0: tmp = tmp2
        else: tmp = tmp.add(tmp2, fill_value=0)
        del tmp2, df
        gc.collect()
    # CONVERT THE GROUPBY RESULT BACK TO A DATAFRAME
    tmp = tmp.reset_index()
    tmp = tmp.sort_values(['aid_x','wgt'],ascending=[True,False])
    # SAVE TOP 15
    tmp = tmp.reset_index(drop=True)
    tmp['n'] = tmp.groupby('aid_x').aid_y.cumcount()
    tmp = tmp.loc[tmp.n<15].drop('n',axis=1)
    # SAVE PART TO DISK (convert to pandas first uses less memory)
    tmp.to_pandas().to_parquet(f'top_15_carts_orders_v{VER}_{PART}.pqt')

Run results (shown as a screenshot in the original post)

2) "Buy2Buy" Co-visitation Matrix

%%time
# USE SMALLEST DISK_PIECES POSSIBLE WITHOUT MEMORY ERROR
DISK_PIECES = 1
SIZE = 1.86e6/DISK_PIECES

# COMPUTE IN PARTS FOR MEMORY MANAGEMENT
for PART in range(DISK_PIECES):
    print()
    print('### DISK PART',PART+1)
    
    # MERGE IS FASTEST PROCESSING CHUNKS WITHIN CHUNKS
    # => OUTER CHUNKS
    for j in range(6):
        a = j*CHUNK
        b = min( (j+1)*CHUNK, len(files) )
        print(f'Processing files {a} thru {b-1} in groups of {READ_CT}...')
        
        # => INNER CHUNKS
        for k in range(a,b,READ_CT):
            # READ FILE
            df = [read_file(files[k])]
            for i in range(1,READ_CT): 
                if k+i<b: df.append( read_file(files[k+i]) )
            df = cudf.concat(df,ignore_index=True,axis=0)
            df = df.loc[df['type'].isin([1,2])] # ONLY WANT CARTS AND ORDERS
            df = df.sort_values(['session','ts'],ascending=[True,False])
            # USE TAIL OF SESSION
            df = df.reset_index(drop=True)
            df['n'] = df.groupby('session').cumcount()
            df = df.loc[df.n<30].drop('n',axis=1)
            # CREATE PAIRS
            df = df.merge(df,on='session')
            df = df.loc[ ((df.ts_x - df.ts_y).abs()< 14 * 24 * 60 * 60) & (df.aid_x != df.aid_y) ] # 14 DAYS
            # MEMORY MANAGEMENT COMPUTE IN PARTS
            df = df.loc[(df.aid_x >= PART*SIZE)&(df.aid_x < (PART+1)*SIZE)]
            # ASSIGN WEIGHTS
            df = df[['session', 'aid_x', 'aid_y','type_y']].drop_duplicates(['session', 'aid_x', 'aid_y'])
            df['wgt'] = 1
            df = df[['aid_x','aid_y','wgt']]
            df.wgt = df.wgt.astype('float32')
            df = df.groupby(['aid_x','aid_y']).wgt.sum()
            # COMBINE INNER CHUNKS
            if k==a: tmp2 = df
            else: tmp2 = tmp2.add(df, fill_value=0)
            print(k,', ',end='')
        print()
        # COMBINE OUTER CHUNKS
        if a==0: tmp = tmp2
        else: tmp = tmp.add(tmp2, fill_value=0)
        del tmp2, df
        gc.collect()
    # CONVERT THE GROUPBY RESULT BACK TO A DATAFRAME
    tmp = tmp.reset_index()
    tmp = tmp.sort_values(['aid_x','wgt'],ascending=[True,False])
    # SAVE TOP 15
    tmp = tmp.reset_index(drop=True)
    tmp['n'] = tmp.groupby('aid_x').aid_y.cumcount()
    tmp = tmp.loc[tmp.n<15].drop('n',axis=1)
    # SAVE PART TO DISK (convert to pandas first uses less memory)
    tmp.to_pandas().to_parquet(f'top_15_buy2buy_v{VER}_{PART}.pqt')

Run results

We will process 146 files, in groups of 5 and chunks of 25.

### DISK PART 1
Processing files 0 thru 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 thru 49 in groups of 5...
25 , 30 , 35 , 40 , 45 , 
Processing files 50 thru 74 in groups of 5...
50 , 55 , 60 , 65 , 70 , 
Processing files 75 thru 99 in groups of 5...
75 , 80 , 85 , 90 , 95 , 
Processing files 100 thru 124 in groups of 5...
100 , 105 , 110 , 115 , 120 , 
Processing files 125 thru 145 in groups of 5...
125 , 130 , 135 , 140 , 145 , 

### DISK PART 2
Processing files 0 thru 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 thru 49 in groups of 5...
25 , 30 , 35 , 40 , 45 , 
Processing files 50 thru 74 in groups of 5...
50 , 55 , 60 , 65 , 70 , 
Processing files 75 thru 99 in groups of 5...
75 , 80 , 85 , 90 , 95 , 
Processing files 100 thru 124 in groups of 5...
100 , 105 , 110 , 115 , 120 , 
Processing files 125 thru 145 in groups of 5...
125 , 130 , 135 , 140 , 145 , 

### DISK PART 3
Processing files 0 thru 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 thru 49 in groups of 5...
25 , 30 , 35 , 40 , 45 , 
Processing files 50 thru 74 in groups of 5...
50 , 55 , 60 , 65 , 70 , 
Processing files 75 thru 99 in groups of 5...
75 , 80 , 85 , 90 , 95 , 
Processing files 100 thru 124 in groups of 5...
100 , 105 , 110 , 115 , 120 , 
Processing files 125 thru 145 in groups of 5...
125 , 130 , 135 , 140 , 145 , 

### DISK PART 4
Processing files 0 thru 24 in groups of 5...
0 , 5 , 10 , 15 , 20 , 
Processing files 25 thru 49 in groups of 5...
25 , 30 , 35 , 40 , 45 , 
Processing files 50 thru 74 in groups of 5...
50 , 55 , 60 , 65 , 70 , 
Processing files 75 thru 99 in groups of 5...
75 , 80 , 85 , 90 , 95 , 
Processing files 100 thru 124 in groups of 5...
100 , 105 , 110 , 115 , 120 , 
Processing files 125 thru 145 in groups of 5...
125 , 130 , 135 , 140 , 145 , 


3) "Clicks" Co-visitation Matrix - Time Weighted

%%time
# USE SMALLEST DISK_PIECES POSSIBLE WITHOUT MEMORY ERROR
DISK_PIECES = 4
SIZE = 1.86e6/DISK_PIECES

# COMPUTE IN PARTS FOR MEMORY MANAGEMENT
for PART in range(DISK_PIECES):
    print()
    print('### DISK PART',PART+1)
    
    # MERGE IS FASTEST PROCESSING CHUNKS WITHIN CHUNKS
    # => OUTER CHUNKS
    for j in range(6):
        a = j*CHUNK
        b = min( (j+1)*CHUNK, len(files) )
        print(f'Processing files {a} thru {b-1} in groups of {READ_CT}...')
        
        # => INNER CHUNKS
        for k in range(a,b,READ_CT):
            # READ FILE
            df = [read_file(files[k])]
            for i in range(1,READ_CT): 
                if k+i<b: df.append( read_file(files[k+i]) )
            df = cudf.concat(df,ignore_index=True,axis=0)
            df = df.sort_values(['session','ts'],ascending=[True,False])
            # USE TAIL OF SESSION
            df = df.reset_index(drop=True)
            df['n'] = df.groupby('session').cumcount()
            df = df.loc[df.n<30].drop('n',axis=1)
            # CREATE PAIRS
            df = df.merge(df,on='session')
            df = df.loc[ ((df.ts_x - df.ts_y).abs()< 24 * 60 * 60) & (df.aid_x != df.aid_y) ]
            # MEMORY MANAGEMENT COMPUTE IN PARTS
            df = df.loc[(df.aid_x >= PART*SIZE)&(df.aid_x < (PART+1)*SIZE)]
            # ASSIGN WEIGHTS
            df = df[['session', 'aid_x', 'aid_y','ts_x']].drop_duplicates(['session', 'aid_x', 'aid_y'])
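            # TIME WEIGHT: linear ramp from 1 to 4 over the data's time span; 1659304800 and 1662328791 appear to be the earliest and latest timestamps (in seconds) in the data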
            df['wgt'] = 1 + 3*(df.ts_x - 1659304800)/(1662328791-1659304800)
            df = df[['aid_x','aid_y','wgt']]
            df.wgt = df.wgt.astype('float32')
            df = df.groupby(['aid_x','aid_y']).wgt.sum()
            # COMBINE INNER CHUNKS
            if k==a: tmp2 = df
            else: tmp2 = tmp2.add(df, fill_value=0)
            print(k,', ',end='')
        print()
        # COMBINE OUTER CHUNKS
        if a==0: tmp = tmp2
        else: tmp = tmp.add(tmp2, fill_value=0)
        del tmp2, df
        gc.collect()
    # CONVERT THE GROUPBY RESULT BACK TO A DATAFRAME
    tmp = tmp.reset_index()
    tmp = tmp.sort_values(['aid_x','wgt'],ascending=[True,False])
    # SAVE TOP 20
    tmp = tmp.reset_index(drop=True)
    tmp['n'] = tmp.groupby('aid_x').aid_y.cumcount()
    tmp = tmp.loc[tmp.n<20].drop('n',axis=1)
    # SAVE PART TO DISK (convert to pandas first uses less memory)
    tmp.to_pandas().to_parquet(f'top_20_clicks_v{VER}_{PART}.pqt')
# FREE MEMORY
del data_cache, tmp
_ = gc.collect()

Step 2 - Re-Rank (choose 20) with Handcrafted Rules

def load_test():    
    dfs = []
    for e, chunk_file in enumerate(glob.glob('../input/otto-validation/test_parquet/*')):
        chunk = pd.read_parquet(chunk_file)
        chunk.ts = (chunk.ts/1000).astype('int32')
        chunk['type'] = chunk['type'].map(type_labels).astype('int8')
        dfs.append(chunk)
    return pd.concat(dfs).reset_index(drop=True) #.astype({"ts": "datetime64[ms]"})

test_df = load_test()
print('Test data has shape',test_df.shape)
test_df.head()
%%time
# LOAD THREE CO-VISITATION MATRICES
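# each parquet part holds (aid_x, aid_y, wgt) rows already sorted by wgt descending, so grouping aid_y into a list per aid_x gives a ready-to-use lookup dict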
def pqt_to_dict(df):
    return df.groupby('aid_x').aid_y.apply(list).to_dict()

top_20_clicks = pqt_to_dict( pd.read_parquet(f'top_20_clicks_v{VER}_0.pqt') )
for k in range(1,DISK_PIECES): 
    top_20_clicks.update( pqt_to_dict( pd.read_parquet(f'top_20_clicks_v{VER}_{k}.pqt') ) )
top_20_buys = pqt_to_dict( pd.read_parquet(f'top_15_carts_orders_v{VER}_0.pqt') )
for k in range(1,DISK_PIECES): 
    top_20_buys.update( pqt_to_dict( pd.read_parquet(f'top_15_carts_orders_v{VER}_{k}.pqt') ) )
top_20_buy2buy = pqt_to_dict( pd.read_parquet(f'top_15_buy2buy_v{VER}_0.pqt') )

# TOP CLICKS AND ORDERS IN TEST (note: 'type' was mapped to ints in load_test, so compare against 0/2 rather than the strings)
top_clicks = test_df.loc[test_df['type']==0,'aid'].value_counts().index.values[:20]
top_orders = test_df.loc[test_df['type']==2,'aid'].value_counts().index.values[:20]

print('Here are size of our 3 co-visitation matrices:')
print( len( top_20_clicks ), len( top_20_buy2buy ), len( top_20_buys ) )
#type_weight_multipliers = {'clicks': 1, 'carts': 6, 'orders': 3}
type_weight_multipliers = {0: 1, 1: 6, 2: 3}

def suggest_clicks(df):
    # USE USER HISTORY AIDS AND TYPES
    aids=df.aid.tolist()
    types = df.type.tolist()
    unique_aids = list(dict.fromkeys(aids[::-1] ))
    # RERANK CANDIDATES USING WEIGHTS
    if len(unique_aids)>=20:
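        # recency weights: np.logspace(0.1,1,n,base=2)-1 rises from about 0.07 to 1, so more recent events in the session count more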
        weights=np.logspace(0.1,1,len(aids),base=2, endpoint=True)-1
        aids_temp = Counter() 
        # RERANK BASED ON REPEAT ITEMS AND TYPE OF ITEMS
        for aid,w,t in zip(aids,weights,types): 
            aids_temp[aid] += w * type_weight_multipliers[t]
        sorted_aids = [k for k,v in aids_temp.most_common(20)]
        return sorted_aids
    # USE "CLICKS" CO-VISITATION MATRIX
    aids2 = list(itertools.chain(*[top_20_clicks[aid] for aid in unique_aids if aid in top_20_clicks]))
    # RERANK CANDIDATES
    top_aids2 = [aid2 for aid2, cnt in Counter(aids2).most_common(20) if aid2 not in unique_aids]    
    result = unique_aids + top_aids2[:20 - len(unique_aids)]
    # USE TOP20 TEST CLICKS
    return result + list(top_clicks)[:20-len(result)]

def suggest_buys(df):
    # USE USER HISTORY AIDS AND TYPES
    aids=df.aid.tolist()
    types = df.type.tolist()
    # UNIQUE AIDS AND UNIQUE BUYS
    unique_aids = list(dict.fromkeys(aids[::-1] ))
    df = df.loc[(df['type']==1)|(df['type']==2)]
    unique_buys = list(dict.fromkeys( df.aid.tolist()[::-1] ))
    # RERANK CANDIDATES USING WEIGHTS
    if len(unique_aids)>=20:
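        # same recency idea as suggest_clicks, but with a narrower weight range (about 0.41 up to 1)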
        weights=np.logspace(0.5,1,len(aids),base=2, endpoint=True)-1
        aids_temp = Counter() 
        # RERANK BASED ON REPEAT ITEMS AND TYPE OF ITEMS
        for aid,w,t in zip(aids,weights,types): 
            aids_temp[aid] += w * type_weight_multipliers[t]
        # RERANK CANDIDATES USING "BUY2BUY" CO-VISITATION MATRIX
        aids3 = list(itertools.chain(*[top_20_buy2buy[aid] for aid in unique_buys if aid in top_20_buy2buy]))
        for aid in aids3: aids_temp[aid] += 0.1
        sorted_aids = [k for k,v in aids_temp.most_common(20)]
        return sorted_aids
    # USE "CART ORDER" CO-VISITATION MATRIX
    aids2 = list(itertools.chain(*[top_20_buys[aid] for aid in unique_aids if aid in top_20_buys]))
    # USE "BUY2BUY" CO-VISITATION MATRIX
    aids3 = list(itertools.chain(*[top_20_buy2buy[aid] for aid in unique_buys if aid in top_20_buy2buy]))
    # RERANK CANDIDATES
    top_aids2 = [aid2 for aid2, cnt in Counter(aids2+aids3).most_common(20) if aid2 not in unique_aids] 
    result = unique_aids + top_aids2[:20 - len(unique_aids)]
    # USE TOP20 TEST ORDERS
    return result + list(top_orders)[:20-len(result)]
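
As a quick sanity check, both suggest functions can be called directly on a single session's events. The toy example below uses made-up aids and timestamps and assumes the co-visitation dicts, top_clicks and top_orders from the cells above are already in memory:

```python
# Toy single-session history; types: 0=clicks, 1=carts, 2=orders (aids and timestamps are made up)
toy_session = pd.DataFrame({
    'session': [42, 42, 42],
    'aid':     [100, 200, 100],
    'ts':      [1659304800, 1659304860, 1659304920],
    'type':    [0, 0, 1],
})
# With fewer than 20 unique aids, both functions fall back to the co-visitation
# matrices and finally to the overall test-period top 20 to fill up the list.
print(suggest_clicks(toy_session)[:5])
print(suggest_buys(toy_session)[:5])
```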

Create submission CSV

%%time
pred_df_clicks = test_df.sort_values(["session", "ts"]).groupby(["session"]).apply(
    lambda x: suggest_clicks(x)
)

pred_df_buys = test_df.sort_values(["session", "ts"]).groupby(["session"]).apply(
    lambda x: suggest_buys(x)
)
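# pred_df_clicks / pred_df_buys are Series indexed by session id; add_suffix turns each index label into the 'sessionid_type' string expected by the submission format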
clicks_pred_df = pd.DataFrame(pred_df_clicks.add_suffix("_clicks"), columns=["labels"]).reset_index()
orders_pred_df = pd.DataFrame(pred_df_buys.add_suffix("_orders"), columns=["labels"]).reset_index()
carts_pred_df = pd.DataFrame(pred_df_buys.add_suffix("_carts"), columns=["labels"]).reset_index()
pred_df = pd.concat([clicks_pred_df, orders_pred_df, carts_pred_df])
pred_df.columns = ["session_type", "labels"]
pred_df["labels"] = pred_df.labels.apply(lambda x: " ".join(map(str,x)))
pred_df.to_csv("validation_preds.csv", index=False)
pred_df.head()

Compute validation score
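
The metric below is the competition's weighted Recall@20: for each type, recall = (total hits across sessions) / (total number of ground-truth items per session, clipped at 20), and the overall score is 0.10·recall_clicks + 0.30·recall_carts + 0.60·recall_orders, which is exactly what the next cell computes.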

# FREE MEMORY
del pred_df_clicks, pred_df_buys, clicks_pred_df, orders_pred_df, carts_pred_df
del top_20_clicks, top_20_buy2buy, top_20_buys, top_clicks, top_orders, test_df
_ = gc.collect()
%%time
# COMPUTE METRIC
score = 0
weights = {'clicks': 0.10, 'carts': 0.30, 'orders': 0.60}
for t in ['clicks','carts','orders']:
    sub = pred_df.loc[pred_df.session_type.str.contains(t)].copy()
    sub['session'] = sub.session_type.apply(lambda x: int(x.split('_')[0]))
    sub.labels = sub.labels.apply(lambda x: [int(i) for i in x.split(' ')[:20]])
    test_labels = pd.read_parquet('../input/otto-validation/test_labels.parquet')
    test_labels = test_labels.loc[test_labels['type']==t]
    test_labels = test_labels.merge(sub, how='left', on=['session'])
    test_labels['hits'] = test_labels.apply(lambda df: len(set(df.ground_truth).intersection(set(df.labels))), axis=1)
    test_labels['gt_count'] = test_labels.ground_truth.str.len().clip(0,20)
    recall = test_labels['hits'].sum() / test_labels['gt_count'].sum()
    score += weights[t]*recall
    print(f'{t} recall =',recall)
    
print('=============')
print('Overall Recall =',score)
print('=============')

Run results

clicks recall = 0.5255597442145808
carts recall = 0.4093328152483512
orders recall = 0.6487936598117477
=============
Overall Recall = 0.5646320148830121
=============
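
As a quick check of the weighting: 0.10 × 0.5256 + 0.30 × 0.4093 + 0.60 × 0.6488 ≈ 0.5646, which matches the printed overall recall.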
